Overview

Dataset statistics

Number of variables: 60
Number of observations: 5019782
Missing cells: 166704518
Missing cells (%): 55.3%
Total size in memory: 2.2 GiB
Average record size in memory: 480.0 B
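
The headline figures above follow from simple arithmetic on the reported counts. The sketch below recomputes them. The 8 bytes per cell is an assumption (one 64-bit object reference per text value), not something the report states, although it does reproduce the reported sizes.

```python
# Recompute the dataset statistics from the counts reported above.
N_VARIABLES = 60
N_OBSERVATIONS = 5_019_782
MISSING_CELLS = 166_704_518
BYTES_PER_CELL = 8  # assumption: one 64-bit object reference per cell

total_cells = N_VARIABLES * N_OBSERVATIONS
missing_pct = 100 * MISSING_CELLS / total_cells

record_bytes = N_VARIABLES * BYTES_PER_CELL
total_gib = record_bytes * N_OBSERVATIONS / 2**30
column_mib = BYTES_PER_CELL * N_OBSERVATIONS / 2**20

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {record_bytes:.1f} B")
print(f"Total size in memory: {total_gib:.1f} GiB")
print(f"Memory size per variable: {column_mib:.1f} MiB")
```

Under that assumption the per-variable figure matches the "Memory size" value listed for each variable further down.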

Variable types

Text: 60

Dataset

Description: Naturalis Biodiversity Center (NL) - Botany 0061690-241126133413365
URL: https://doi.org/10.15468/dl.4ze7ns

Alerts

license has constant value "CC0 1.0" Constant
rightsHolder has constant value "Naturalis Biodiversity Center" Constant
institutionID has constant value "https://ror.org/0566bfb96" Constant
collectionCode has constant value "Botany" Constant
basisOfRecord has constant value "PreservedSpecimen" Constant
samplingEffort has constant value "0.0 m" Constant
island has constant value "51.41942" Constant
countryCode has constant value "WGS84" Constant
maximumDistanceAboveSurfaceInMeters has constant value "Asia" Constant
geodeticDatum has constant value "WGS84" Constant
coordinateUncertaintyInMeters has constant value "South-Western" Constant
verbatimCoordinates has constant value "Siam [Thailand], Kwae Noi Basin Expedition, near Neeckey, near Wangka." Constant
verbatimSRS has constant value "150.0 m" Constant
geologicalContextID has constant value "15.1" Constant
earliestEonOrLowestEonothem has constant value "98.46667" Constant
latestEonOrHighestEonothem has constant value "WGS84" Constant
identificationVerificationStatus has constant value "Fungi-Ascomycota" Constant
identificationRemarks has constant value "Lichenes-Lecanoromycetes" Constant
namePublishedIn has constant value "species" Constant
subgenus has constant value "Fimbristylis bisumbellata (Forssk.) Bubani" Constant
vernacularName has constant value "Plantae" Constant
nomenclaturalCode has constant value "ICN" Constant
nomenclaturalStatus has constant value "Poales" Constant
otherCatalogNumbers has 3742862 (74.6%) missing values Missing
eventDate has 856201 (17.1%) missing values Missing
habitat has 4109448 (81.9%) missing values Missing
samplingEffort has 5019781 (> 99.9%) missing values Missing
continent has 902365 (18.0%) missing values Missing
island has 5019781 (> 99.9%) missing values Missing
countryCode has 5019780 (> 99.9%) missing values Missing
stateProvince has 3065004 (61.1%) missing values Missing
locality has 763737 (15.2%) missing values Missing
verbatimElevation has 3265629 (65.1%) missing values Missing
maximumDistanceAboveSurfaceInMeters has 5019781 (> 99.9%) missing values Missing
decimalLatitude has 2925885 (58.3%) missing values Missing
decimalLongitude has 2925885 (58.3%) missing values Missing
coordinateUncertaintyInMeters has 5019781 (> 99.9%) missing values Missing
verbatimCoordinates has 5019781 (> 99.9%) missing values Missing
verbatimSRS has 5019781 (> 99.9%) missing values Missing
geologicalContextID has 5019781 (> 99.9%) missing values Missing
earliestEonOrLowestEonothem has 5019781 (> 99.9%) missing values Missing
latestEonOrHighestEonothem has 5019781 (> 99.9%) missing values Missing
earliestEraOrLowestErathem has 5019780 (> 99.9%) missing values Missing
bed has 5019780 (> 99.9%) missing values Missing
typeStatus has 4932431 (98.3%) missing values Missing
identifiedBy has 4152104 (82.7%) missing values Missing
dateIdentified has 4581006 (91.3%) missing values Missing
identificationReferences has 5019780 (> 99.9%) missing values Missing
identificationVerificationStatus has 5019781 (> 99.9%) missing values Missing
identificationRemarks has 5019781 (> 99.9%) missing values Missing
taxonID has 5019780 (> 99.9%) missing values Missing
acceptedNameUsageID has 5019780 (> 99.9%) missing values Missing
namePublishedInID has 5019780 (> 99.9%) missing values Missing
parentNameUsage has 5019780 (> 99.9%) missing values Missing
namePublishedIn has 5019780 (> 99.9%) missing values Missing
phylum has 4742156 (94.5%) missing values Missing
class has 4741605 (94.5%) missing values Missing
order has 143842 (2.9%) missing values Missing
subgenus has 5019781 (> 99.9%) missing values Missing
specificEpithet has 420613 (8.4%) missing values Missing
infraspecificEpithet has 4607995 (91.8%) missing values Missing
scientificNameAuthorship has 355313 (7.1%) missing values Missing
vernacularName has 5019781 (> 99.9%) missing values Missing
nomenclaturalStatus has 5019781 (> 99.9%) missing values Missing
gbifID has unique values Unique
occurrenceID has unique values Unique
catalogNumber has unique values Unique
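
The missing-value alerts above can be reproduced from the raw counts. The sketch below is a reading of how the percentages appear to be displayed, not ydata-profiling's own code. The function name `format_missing` and the `< 0.1%` / `> 99.9%` cut-offs are assumptions inferred from the alert lines. It also prints how many values are present, which shows why a column with 5019781 missing values is reported as constant: only one value remains.

```python
N_OBSERVATIONS = 5_019_782  # number of observations from the Overview

def format_missing(missing: int, total: int = N_OBSERVATIONS) -> str:
    """Render a missing count the way the alert lines appear to display it."""
    pct = 100 * missing / total
    if 0 < pct < 0.1:
        label = "< 0.1%"   # assumed display rule for very small shares
    elif 99.9 < pct < 100:
        label = "> 99.9%"  # assumed display rule for near-total shares
    else:
        label = f"{pct:.1f}%"
    return f"{missing} ({label})"

for column, missing in [
    ("otherCatalogNumbers", 3_742_862),
    ("order", 143_842),
    ("samplingEffort", 5_019_781),
]:
    present = N_OBSERVATIONS - missing
    print(f"{column} has {format_missing(missing)} missing values; {present} present")
```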

Reproduction

Analysis started: 2025-01-14 15:41:55.321161
Analysis finished: 2025-01-14 15:44:32.671775
Duration: 2 minutes and 37.35 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
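
The reported duration can be checked against the two timestamps. This is a small sketch using only the standard library; the format string is an assumption about how the timestamps are laid out.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed layout of the timestamps above
started = datetime.strptime("2025-01-14 15:41:55.321161", FMT)
finished = datetime.strptime("2025-01-14 15:44:32.671775", FMT)

# Elapsed wall-clock time, split into whole minutes and remaining seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```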

Variables

gbifID
Text

Unique 

Distinct: 5019782
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 38.3 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 50197820
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
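
As an illustration of those character properties, the sketch below counts characters by Unicode general category with Python's standard `unicodedata` module. It uses the constant license value "CC0 1.0" from the Alerts and one gbifID sample value. The helper `category_counts` and the short name table are hypothetical, written for this example only. The standard library does not expose scripts or blocks, so only categories are covered.

```python
import unicodedata
from collections import Counter

# Long names for the few general-category codes this example encounters.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
}

def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category."""
    codes = (unicodedata.category(ch) for ch in text)
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)

counts = category_counts("CC0 1.0")
total = sum(counts.values())
for name, n in counts.most_common():
    print(f"{name}: {n} ({100 * n / total:.1f}%)")

# A purely numeric identifier falls into a single category.
print(category_counts("2514633172"))
```

For "CC0 1.0" the shares come out as 42.9%, 28.6%, 14.3% and 14.3%, which agrees with the category breakdown the report lists for the license variable.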

Unique

Unique: 5019782
Unique (%): 100.0%

Sample

1st row: 2514633172
2nd row: 2980371442
3rd row: 2514602651
4th row: 2980366433
5th row: 2514610075

Value                   Count    Frequency (%)
2514633172              1        < 0.1%
2980357438              1        < 0.1%
2516414075              1        < 0.1%
2980344448              1        < 0.1%
2516430099              1        < 0.1%
2980380439              1        < 0.1%
2516309267              1        < 0.1%
2980358429              1        < 0.1%
2514610078              1        < 0.1%
2516623054              1        < 0.1%
Other values (5019772)  5019772  > 99.9%

Most occurring characters

Value  Count    Frequency (%)
5      8927787  17.8%
2      7962993  15.9%
1      7915696  15.8%
4      4128324  8.2%
3      4110468  8.2%
6      4020457  8.0%
7      3910237  7.8%
0      3130209  6.2%
8      3047255  6.1%
9      3044394  6.1%

Most occurring categories

Value           Count     Frequency (%)
Decimal Number  50197820  100.0%

Most frequent character per category

Decimal Number
Value  Count    Frequency (%)
5      8927787  17.8%
2      7962993  15.9%
1      7915696  15.8%
4      4128324  8.2%
3      4110468  8.2%
6      4020457  8.0%
7      3910237  7.8%
0      3130209  6.2%
8      3047255  6.1%
9      3044394  6.1%

Most occurring scripts

Value   Count     Frequency (%)
Common  50197820  100.0%

Most frequent character per script

Common
Value  Count    Frequency (%)
5      8927787  17.8%
2      7962993  15.9%
1      7915696  15.8%
4      4128324  8.2%
3      4110468  8.2%
6      4020457  8.0%
7      3910237  7.8%
0      3130209  6.2%
8      3047255  6.1%
9      3044394  6.1%

Most occurring blocks

Value  Count     Frequency (%)
ASCII  50197820  100.0%

Most frequent character per block

ASCII
Value  Count    Frequency (%)
5      8927787  17.8%
2      7962993  15.9%
1      7915696  15.8%
4      4128324  8.2%
3      4110468  8.2%
6      4020457  8.0%
7      3910237  7.8%
0      3130209  6.2%
8      3047255  6.1%
9      3044394  6.1%

license
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.3 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 35138474
Distinct characters: 5
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CC0 1.0
2nd row: CC0 1.0
3rd row: CC0 1.0
4th row: CC0 1.0
5th row: CC0 1.0

Value  Count    Frequency (%)
cc0    5019782  50.0%
1.0    5019782  50.0%

Most occurring characters

Value    Count     Frequency (%)
C        10039564  28.6%
0        10039564  28.6%
(space)  5019782   14.3%
1        5019782   14.3%
.        5019782   14.3%

Most occurring categories

Value              Count     Frequency (%)
Decimal Number     15059346  42.9%
Uppercase Letter   10039564  28.6%
Space Separator    5019782   14.3%
Other Punctuation  5019782   14.3%

Most frequent character per category

Decimal Number
Value  Count     Frequency (%)
0      10039564  66.7%
1      5019782   33.3%

Uppercase Letter
Value  Count     Frequency (%)
C      10039564  100.0%

Space Separator
Value    Count    Frequency (%)
(space)  5019782  100.0%

Other Punctuation
Value  Count    Frequency (%)
.      5019782  100.0%

Most occurring scripts

Value   Count     Frequency (%)
Common  25098910  71.4%
Latin   10039564  28.6%

Most frequent character per script

Common
Value    Count     Frequency (%)
0        10039564  40.0%
(space)  5019782   20.0%
1        5019782   20.0%
.        5019782   20.0%

Latin
Value  Count     Frequency (%)
C      10039564  100.0%

Most occurring blocks

Value  Count     Frequency (%)
ASCII  35138474  100.0%

Most frequent character per block

ASCII
Value    Count     Frequency (%)
C        10039564  28.6%
0        10039564  28.6%
(space)  5019782   14.3%
1        5019782   14.3%
.        5019782   14.3%

rightsHolder
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.3 MiB

Length

Max length: 29
Median length: 29
Mean length: 29
Min length: 29

Characters and Unicode

Total characters: 145573678
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Naturalis Biodiversity Center
2nd row: Naturalis Biodiversity Center
3rd row: Naturalis Biodiversity Center
4th row: Naturalis Biodiversity Center
5th row: Naturalis Biodiversity Center

Value         Count    Frequency (%)
naturalis     5019782  33.3%
biodiversity  5019782  33.3%
center        5019782  33.3%

Most occurring characters

ValueCountFrequency (%)
i 20079128
13.8%
t 15059346
10.3%
r 15059346
10.3%
e 15059346
10.3%
10039564
 
6.9%
s 10039564
 
6.9%
a 10039564
 
6.9%
d 5019782
 
3.4%
C 5019782
 
3.4%
y 5019782
 
3.4%
Other values (7) 35138474
24.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 120474768
82.8%
Uppercase Letter 15059346
 
10.3%
Space Separator 10039564
 
6.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 20079128
16.7%
t 15059346
12.5%
r 15059346
12.5%
e 15059346
12.5%
s 10039564
8.3%
a 10039564
8.3%
d 5019782
 
4.2%
y 5019782
 
4.2%
v 5019782
 
4.2%
o 5019782
 
4.2%
Other values (3) 15059346
12.5%
Uppercase Letter
ValueCountFrequency (%)
C 5019782
33.3%
N 5019782
33.3%
B 5019782
33.3%
Space Separator
ValueCountFrequency (%)
10039564
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 135534114
93.1%
Common 10039564
 
6.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 20079128
14.8%
t 15059346
11.1%
r 15059346
11.1%
e 15059346
11.1%
s 10039564
 
7.4%
a 10039564
 
7.4%
d 5019782
 
3.7%
C 5019782
 
3.7%
y 5019782
 
3.7%
v 5019782
 
3.7%
Other values (6) 30118692
22.2%
Common
ValueCountFrequency (%)
10039564
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 145573678
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 20079128
13.8%
t 15059346
10.3%
r 15059346
10.3%
e 15059346
10.3%
10039564
 
6.9%
s 10039564
 
6.9%
a 10039564
 
6.9%
d 5019782
 
3.4%
C 5019782
 
3.4%
y 5019782
 
3.4%
Other values (7) 35138474
24.1%

institutionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.3 MiB

Length

Max length: 25
Median length: 25
Mean length: 25
Min length: 25

Characters and Unicode

Total characters: 125494550
Distinct characters: 16
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: https://ror.org/0566bfb96
2nd row: https://ror.org/0566bfb96
3rd row: https://ror.org/0566bfb96
4th row: https://ror.org/0566bfb96
5th row: https://ror.org/0566bfb96

Value                      Count    Frequency (%)
https://ror.org/0566bfb96  5019782  100.0%

Most occurring characters

ValueCountFrequency (%)
/ 15059346
12.0%
r 15059346
12.0%
6 15059346
12.0%
t 10039564
 
8.0%
o 10039564
 
8.0%
b 10039564
 
8.0%
h 5019782
 
4.0%
p 5019782
 
4.0%
s 5019782
 
4.0%
: 5019782
 
4.0%
Other values (6) 30118692
24.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 70276948
56.0%
Decimal Number 30118692
24.0%
Other Punctuation 25098910
 
20.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 15059346
21.4%
t 10039564
14.3%
o 10039564
14.3%
b 10039564
14.3%
h 5019782
 
7.1%
p 5019782
 
7.1%
s 5019782
 
7.1%
g 5019782
 
7.1%
f 5019782
 
7.1%
Decimal Number
ValueCountFrequency (%)
6 15059346
50.0%
0 5019782
 
16.7%
5 5019782
 
16.7%
9 5019782
 
16.7%
Other Punctuation
ValueCountFrequency (%)
/ 15059346
60.0%
: 5019782
 
20.0%
. 5019782
 
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 70276948
56.0%
Common 55217602
44.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 15059346
21.4%
t 10039564
14.3%
o 10039564
14.3%
b 10039564
14.3%
h 5019782
 
7.1%
p 5019782
 
7.1%
s 5019782
 
7.1%
g 5019782
 
7.1%
f 5019782
 
7.1%
Common
ValueCountFrequency (%)
/ 15059346
27.3%
6 15059346
27.3%
: 5019782
 
9.1%
. 5019782
 
9.1%
0 5019782
 
9.1%
5 5019782
 
9.1%
9 5019782
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 125494550
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 15059346
12.0%
r 15059346
12.0%
6 15059346
12.0%
t 10039564
 
8.0%
o 10039564
 
8.0%
b 10039564
 
8.0%
h 5019782
 
4.0%
p 5019782
 
4.0%
s 5019782
 
4.0%
: 5019782
 
4.0%
Other values (6) 30118692
24.0%

collectionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.3 MiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 30118692
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Botany
2nd row: Botany
3rd row: Botany
4th row: Botany
5th row: Botany

Value   Count    Frequency (%)
botany  5019782  100.0%

Most occurring characters

ValueCountFrequency (%)
B 5019782
16.7%
o 5019782
16.7%
t 5019782
16.7%
a 5019782
16.7%
n 5019782
16.7%
y 5019782
16.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 25098910
83.3%
Uppercase Letter 5019782
 
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 5019782
20.0%
t 5019782
20.0%
a 5019782
20.0%
n 5019782
20.0%
y 5019782
20.0%
Uppercase Letter
ValueCountFrequency (%)
B 5019782
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 30118692
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
B 5019782
16.7%
o 5019782
16.7%
t 5019782
16.7%
a 5019782
16.7%
n 5019782
16.7%
y 5019782
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30118692
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
B 5019782
16.7%
o 5019782
16.7%
t 5019782
16.7%
a 5019782
16.7%
n 5019782
16.7%
y 5019782
16.7%

basisOfRecord
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 785
Missing (%): < 0.1%
Memory size: 38.3 MiB

Length

Max length: 17
Median length: 17
Mean length: 17
Min length: 17

Characters and Unicode

Total characters: 85322949
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PreservedSpecimen
2nd row: PreservedSpecimen
3rd row: PreservedSpecimen
4th row: PreservedSpecimen
5th row: PreservedSpecimen

Value              Count    Frequency (%)
preservedspecimen  5018997  100.0%

Most occurring characters

ValueCountFrequency (%)
e 25094985
29.4%
r 10037994
 
11.8%
P 5018997
 
5.9%
s 5018997
 
5.9%
v 5018997
 
5.9%
d 5018997
 
5.9%
S 5018997
 
5.9%
p 5018997
 
5.9%
c 5018997
 
5.9%
i 5018997
 
5.9%
Other values (2) 10037994
 
11.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 75284955
88.2%
Uppercase Letter 10037994
 
11.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 25094985
33.3%
r 10037994
 
13.3%
s 5018997
 
6.7%
v 5018997
 
6.7%
d 5018997
 
6.7%
p 5018997
 
6.7%
c 5018997
 
6.7%
i 5018997
 
6.7%
m 5018997
 
6.7%
n 5018997
 
6.7%
Uppercase Letter
ValueCountFrequency (%)
P 5018997
50.0%
S 5018997
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 85322949
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 25094985
29.4%
r 10037994
 
11.8%
P 5018997
 
5.9%
s 5018997
 
5.9%
v 5018997
 
5.9%
d 5018997
 
5.9%
S 5018997
 
5.9%
p 5018997
 
5.9%
c 5018997
 
5.9%
i 5018997
 
5.9%
Other values (2) 10037994
 
11.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 85322949
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 25094985
29.4%
r 10037994
 
11.8%
P 5018997
 
5.9%
s 5018997
 
5.9%
v 5018997
 
5.9%
d 5018997
 
5.9%
S 5018997
 
5.9%
p 5018997
 
5.9%
c 5018997
 
5.9%
i 5018997
 
5.9%
Other values (2) 10037994
 
11.8%

occurrenceID
Text

Unique 

Distinct: 5019782
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 38.3 MiB

Length

Max length: 81
Median length: 61
Mean length: 61.70256258
Min length: 58

Characters and Unicode

Total characters: 309733413
Distinct characters: 65
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5019782
Unique (%): 100.0%

Sample

1st row: https://data.biodiversitydata.nl/naturalis/specimen/L.2851604
2nd row: https://data.biodiversitydata.nl/naturalis/specimen/L%20%200971472
3rd row: https://data.biodiversitydata.nl/naturalis/specimen/L.2851644
4th row: https://data.biodiversitydata.nl/naturalis/specimen/L%20%200971531
5th row: https://data.biodiversitydata.nl/naturalis/specimen/L.2851686

Value                                                               Count    Frequency (%)
https://data.biodiversitydata.nl/naturalis/specimen/wag.1226003     2        < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/wag.1816421     2        < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/wag0454007      2        < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/l.4308389       2        < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/wag0100360      2        < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/l%20%200981551  2        < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/l%20%200820195  2        < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/wag.1250897     2        < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/l.4434831       2        < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/l.4373010       2        < 0.1%
Other values (5019737)                                              5019762  > 99.9%

Most occurring characters

ValueCountFrequency (%)
a 30118729
 
9.7%
t 30118693
 
9.7%
/ 25098912
 
8.1%
i 25098910
 
8.1%
s 20079128
 
6.5%
n 15059347
 
4.9%
e 15059347
 
4.9%
d 15059346
 
4.9%
. 14670101
 
4.7%
l 10039580
 
3.2%
Other values (55) 109331320
35.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 220870604
71.3%
Other Punctuation 45484356
 
14.7%
Decimal Number 36335937
 
11.7%
Uppercase Letter 7042491
 
2.3%
Connector Punctuation 13
 
< 0.1%
Dash Punctuation 8
 
< 0.1%
Math Symbol 2
 
< 0.1%
Currency Symbol 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
L 3368280
47.8%
A 1011325
 
14.4%
G 896119
 
12.7%
W 896053
 
12.7%
U 640126
 
9.1%
M 115226
 
1.6%
D 115202
 
1.6%
N 21
 
< 0.1%
P 21
 
< 0.1%
S 21
 
< 0.1%
Other values (13) 97
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 30118729
13.6%
t 30118693
13.6%
i 25098910
11.4%
s 20079128
9.1%
n 15059347
 
6.8%
e 15059347
 
6.8%
d 15059346
 
6.8%
l 10039580
 
4.5%
r 10039564
 
4.5%
p 10039564
 
4.5%
Other values (10) 40158396
18.2%
Decimal Number
ValueCountFrequency (%)
1 5550526
15.3%
2 4766114
13.1%
0 4160945
11.5%
3 3919832
10.8%
4 3413591
9.4%
7 2998891
8.3%
5 2992564
8.2%
6 2885873
7.9%
9 2833494
7.8%
8 2814107
7.7%
Other Punctuation
ValueCountFrequency (%)
/ 25098912
55.2%
. 14670101
32.3%
: 5019783
 
11.0%
% 695511
 
1.5%
! 47
 
< 0.1%
' 1
 
< 0.1%
@ 1
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 13
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 8
100.0%
Math Symbol
ValueCountFrequency (%)
+ 2
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 227913095
73.6%
Common 81820318
 
26.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 30118729
13.2%
t 30118693
13.2%
i 25098910
11.0%
s 20079128
8.8%
n 15059347
 
6.6%
e 15059347
 
6.6%
d 15059346
 
6.6%
l 10039580
 
4.4%
r 10039564
 
4.4%
p 10039564
 
4.4%
Other values (33) 47200887
20.7%
Common
ValueCountFrequency (%)
/ 25098912
30.7%
. 14670101
17.9%
1 5550526
 
6.8%
: 5019783
 
6.1%
2 4766114
 
5.8%
0 4160945
 
5.1%
3 3919832
 
4.8%
4 3413591
 
4.2%
7 2998891
 
3.7%
5 2992564
 
3.7%
Other values (12) 9229059
 
11.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 309733413
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 30118729
 
9.7%
t 30118693
 
9.7%
/ 25098912
 
8.1%
i 25098910
 
8.1%
s 20079128
 
6.5%
n 15059347
 
4.9%
e 15059347
 
4.9%
d 15059346
 
4.9%
. 14670101
 
4.7%
l 10039580
 
3.2%
Other values (55) 109331320
35.3%

catalogNumber
Text

Unique 

Distinct: 5019782
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 38.3 MiB

Length

Max length: 15
Median length: 9
Mean length: 9.425454532
Min length: 6

Characters and Unicode

Total characters: 47313727
Distinct characters: 58
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5019782
Unique (%): 100.0%

Sample

1st row: L.2851604
2nd row: L 0971472
3rd row: L.2851644
4th row: L 0971531
5th row: L.2851686
ValueCountFrequency (%)
l 285081
 
5.3%
u 62704
 
1.2%
04 7
 
< 0.1%
0012538 3
 
< 0.1%
3
 
< 0.1%
0228872 2
 
< 0.1%
0004574 2
 
< 0.1%
0229129 2
 
< 0.1%
0004635 2
 
< 0.1%
0256210 2
 
< 0.1%
Other values (4994407) 5019794
93.5%

Most occurring characters

ValueCountFrequency (%)
1 5550526
11.7%
. 4630537
9.8%
2 4070617
8.6%
3 3919831
 
8.3%
0 3465436
 
7.3%
4 3413591
 
7.2%
L 3368280
 
7.1%
7 2998891
 
6.3%
5 2992563
 
6.3%
6 2885861
 
6.1%
Other values (48) 10017594
21.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 34944917
73.9%
Uppercase Letter 7042489
 
14.9%
Other Punctuation 4630591
 
9.8%
Space Separator 695497
 
1.5%
Lowercase Letter 196
 
< 0.1%
Connector Punctuation 13
 
< 0.1%
Modifier Symbol 12
 
< 0.1%
Dash Punctuation 8
 
< 0.1%
Math Symbol 2
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
L 3368280
47.8%
A 1011325
 
14.4%
G 896119
 
12.7%
W 896053
 
12.7%
U 640126
 
9.1%
M 115226
 
1.6%
D 115202
 
1.6%
P 21
 
< 0.1%
S 21
 
< 0.1%
N 21
 
< 0.1%
Other values (13) 95
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 5550526
15.9%
2 4070617
11.6%
3 3919831
11.2%
0 3465436
9.9%
4 3413591
9.8%
7 2998891
8.6%
5 2992563
8.6%
6 2885861
8.3%
9 2833494
8.1%
8 2814107
8.1%
Lowercase Letter
ValueCountFrequency (%)
w 93
47.4%
g 43
21.9%
a 37
 
18.9%
l 16
 
8.2%
o 2
 
1.0%
u 1
 
0.5%
e 1
 
0.5%
t 1
 
0.5%
n 1
 
0.5%
v 1
 
0.5%
Other Punctuation
ValueCountFrequency (%)
. 4630537
> 99.9%
! 47
 
< 0.1%
/ 2
 
< 0.1%
' 1
 
< 0.1%
: 1
 
< 0.1%
\ 1
 
< 0.1%
@ 1
 
< 0.1%
? 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
695497
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 13
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 12
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 8
100.0%
Math Symbol
ValueCountFrequency (%)
+ 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 40271042
85.1%
Latin 7042685
 
14.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
L 3368280
47.8%
A 1011325
 
14.4%
G 896119
 
12.7%
W 896053
 
12.7%
U 640126
 
9.1%
M 115226
 
1.6%
D 115202
 
1.6%
w 93
 
< 0.1%
g 43
 
< 0.1%
a 37
 
< 0.1%
Other values (23) 181
 
< 0.1%
Common
ValueCountFrequency (%)
1 5550526
13.8%
. 4630537
11.5%
2 4070617
10.1%
3 3919831
9.7%
0 3465436
8.6%
4 3413591
8.5%
7 2998891
7.4%
5 2992563
7.4%
6 2885861
7.2%
9 2833494
7.0%
Other values (15) 3509695
8.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 47313727
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 5550526
11.7%
. 4630537
9.8%
2 4070617
8.6%
3 3919831
 
8.3%
0 3465436
 
7.3%
4 3413591
 
7.2%
L 3368280
 
7.1%
7 2998891
 
6.3%
5 2992563
 
6.3%
6 2885861
 
6.1%
Other values (48) 10017594
21.2%

Distinct: 2852768
Distinct (%): 56.8%
Missing: 1
Missing (%): < 0.1%
Memory size: 38.3 MiB

Length

Max length: 121
Median length: 104
Mean length: 21.23713803
Min length: 1

Characters and Unicode

Total characters: 106605782
Distinct characters: 139
Distinct categories: 16
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2358777
Unique (%): 47.0%

Sample

1st row: Unknown s.n.
2nd row: Zainoeddin bb 17357
3rd row: Wijk, JH van s.n.
4th row: Unknown bb 17412
5th row: Koster, JT 6255

Value                  Count     Frequency (%)
s.n                    1517120   7.6%
unknown                403748    2.0%
van                    402082    2.0%
de                     306350    1.5%
a                      267054    1.3%
j                      265883    1.3%
m                      160895    0.8%
h                      141882    0.7%
p                      138822    0.7%
c                      138103    0.7%
Other values (172568)  16227490  81.3%

Most occurring characters

ValueCountFrequency (%)
14949623
 
14.0%
n 6590984
 
6.2%
e 6135300
 
5.8%
, 5592121
 
5.2%
a 4397488
 
4.1%
s 3844877
 
3.6%
r 3532381
 
3.3%
o 3476858
 
3.3%
. 3240362
 
3.0%
i 2918794
 
2.7%
Other values (129) 51926994
48.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 47775564
44.8%
Uppercase Letter 18878006
 
17.7%
Space Separator 14949650
 
14.0%
Decimal Number 13997887
 
13.1%
Other Punctuation 10496024
 
9.8%
Dash Punctuation 387395
 
0.4%
Open Punctuation 59284
 
0.1%
Close Punctuation 59257
 
0.1%
Connector Punctuation 2158
 
< 0.1%
Math Symbol 316
 
< 0.1%
Other values (6) 241
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 6590984
13.8%
e 6135300
12.8%
a 4397488
9.2%
s 3844877
 
8.0%
r 3532381
 
7.4%
o 3476858
 
7.3%
i 2918794
 
6.1%
l 2230606
 
4.7%
t 2053581
 
4.3%
d 1674628
 
3.5%
Other values (46) 10920067
22.9%
Uppercase Letter
ValueCountFrequency (%)
J 1947721
 
10.3%
H 1300682
 
6.9%
A 1296584
 
6.9%
S 1281050
 
6.8%
B 1243337
 
6.6%
M 1192210
 
6.3%
C 1016865
 
5.4%
W 948390
 
5.0%
P 926499
 
4.9%
R 844491
 
4.5%
Other values (28) 6880177
36.4%
Other Punctuation
ValueCountFrequency (%)
, 5592121
53.3%
. 3240362
30.9%
; 1574515
 
15.0%
/ 48042
 
0.5%
' 30407
 
0.3%
! 6979
 
0.1%
: 2233
 
< 0.1%
? 666
 
< 0.1%
\ 277
 
< 0.1%
& 192
 
< 0.1%
Other values (6) 230
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 2197925
15.7%
2 1669670
11.9%
3 1475003
10.5%
4 1361893
9.7%
5 1307307
9.3%
6 1259786
9.0%
7 1213233
8.7%
0 1176749
8.4%
8 1175255
8.4%
9 1161066
8.3%
Open Punctuation
ValueCountFrequency (%)
( 57600
97.2%
[ 1683
 
2.8%
{ 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
14949623
> 99.9%
  27
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 57575
97.2%
] 1682
 
2.8%
Math Symbol
ValueCountFrequency (%)
+ 286
90.5%
= 30
 
9.5%
Other Number
ValueCountFrequency (%)
½ 81
95.3%
¼ 4
 
4.7%
Other Letter
ValueCountFrequency (%)
ª 6
75.0%
º 2
 
25.0%
Dash Punctuation
ValueCountFrequency (%)
- 387395
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2158
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 145
100.0%
Currency Symbol
ValueCountFrequency (%)
¢ 1
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%
Other Symbol
ValueCountFrequency (%)
° 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 66653578
62.5%
Common 39952204
37.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 6590984
 
9.9%
e 6135300
 
9.2%
a 4397488
 
6.6%
s 3844877
 
5.8%
r 3532381
 
5.3%
o 3476858
 
5.2%
i 2918794
 
4.4%
l 2230606
 
3.3%
t 2053581
 
3.1%
J 1947721
 
2.9%
Other values (86) 29524988
44.3%
Common
ValueCountFrequency (%)
14949623
37.4%
, 5592121
 
14.0%
. 3240362
 
8.1%
1 2197925
 
5.5%
2 1669670
 
4.2%
; 1574515
 
3.9%
3 1475003
 
3.7%
4 1361893
 
3.4%
5 1307307
 
3.3%
6 1259786
 
3.2%
Other values (33) 5323999
 
13.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 106364932
99.8%
None 240848
 
0.2%
Punctuation 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
14949623
 
14.1%
n 6590984
 
6.2%
e 6135300
 
5.8%
, 5592121
 
5.3%
a 4397488
 
4.1%
s 3844877
 
3.6%
r 3532381
 
3.3%
o 3476858
 
3.3%
. 3240362
 
3.0%
i 2918794
 
2.7%
Other values (77) 51686144
48.6%
None
ValueCountFrequency (%)
é 66202
27.5%
ü 43282
18.0%
ö 22422
 
9.3%
á 20043
 
8.3%
è 16968
 
7.0%
í 12114
 
5.0%
ñ 10827
 
4.5%
ó 8648
 
3.6%
ß 8310
 
3.5%
ë 5441
 
2.3%
Other values (40) 26591
11.0%
Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%

Distinct: 101508
Distinct (%): 2.0%
Missing: 11448
Missing (%): 0.2%
Memory size: 38.3 MiB

Length

Max length: 108
Median length: 96
Mean length: 14.60004524
Min length: 1

Characters and Unicode

Total characters: 73121903
Distinct characters: 125
Distinct categories: 13
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 37149
Unique (%): 0.7%

Sample

1st row: Unknown
2nd row: Zainoeddin
3rd row: Wijk JH van
4th row: Unknown
5th row: Koster JT

Value                 Count     Frequency (%)
unknown               403744    2.9%
van                   402079    2.9%
de                    306325    2.2%
j                     264889    1.9%
a                     210805    1.5%
m                     155481    1.1%
al                    137949    1.0%
h                     137879    1.0%
r                     135106    1.0%
p                     133204    0.9%
Other values (40914)  11811523  83.8%

Most occurring characters

ValueCountFrequency (%)
9090878
 
12.4%
e 6123944
 
8.4%
n 5043429
 
6.9%
a 4353925
 
6.0%
r 3514483
 
4.8%
o 3466711
 
4.7%
i 2903594
 
4.0%
s 2322445
 
3.2%
l 2220840
 
3.0%
t 2048480
 
2.8%
Other values (115) 32033174
43.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 44430714
60.8%
Uppercase Letter 17477628
 
23.9%
Space Separator 9090903
 
12.4%
Other Punctuation 1761774
 
2.4%
Dash Punctuation 261926
 
0.4%
Decimal Number 45250
 
0.1%
Open Punctuation 25796
 
< 0.1%
Close Punctuation 25788
 
< 0.1%
Connector Punctuation 2002
 
< 0.1%
Math Symbol 119
 
< 0.1%
Other values (3) 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 6123944
13.8%
n 5043429
11.4%
a 4353925
9.8%
r 3514483
 
7.9%
o 3466711
 
7.8%
i 2903594
 
6.5%
s 2322445
 
5.2%
l 2220840
 
5.0%
t 2048480
 
4.6%
d 1649098
 
3.7%
Other values (46) 10783765
24.3%
Uppercase Letter
ValueCountFrequency (%)
J 1943668
 
11.1%
H 1259497
 
7.2%
M 1172359
 
6.7%
A 1164819
 
6.7%
B 1112442
 
6.4%
S 1108388
 
6.3%
C 988237
 
5.7%
W 913411
 
5.2%
R 795966
 
4.6%
P 782981
 
4.5%
Other values (28) 6235860
35.7%
Other Punctuation
ValueCountFrequency (%)
; 1574505
89.4%
. 156410
 
8.9%
' 29989
 
1.7%
? 412
 
< 0.1%
/ 271
 
< 0.1%
& 104
 
< 0.1%
! 49
 
< 0.1%
¡ 30
 
< 0.1%
: 3
 
< 0.1%
1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 9720
21.5%
9 8640
19.1%
6 5668
12.5%
7 5037
11.1%
4 4020
8.9%
8 3781
 
8.4%
0 2227
 
4.9%
3 2127
 
4.7%
2 2056
 
4.5%
5 1974
 
4.4%
Space Separator
ValueCountFrequency (%)
9090878
> 99.9%
  25
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
+ 100
84.0%
= 19
 
16.0%
Dash Punctuation
ValueCountFrequency (%)
- 261926
100.0%
Open Punctuation
ValueCountFrequency (%)
( 25796
100.0%
Close Punctuation
ValueCountFrequency (%)
) 25788
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2002
100.0%
Currency Symbol
ValueCountFrequency (%)
¢ 1
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 61908342
84.7%
Common 11213561
 
15.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 6123944
 
9.9%
n 5043429
 
8.1%
a 4353925
 
7.0%
r 3514483
 
5.7%
o 3466711
 
5.6%
i 2903594
 
4.7%
s 2322445
 
3.8%
l 2220840
 
3.6%
t 2048480
 
3.3%
J 1943668
 
3.1%
Other values (84) 27966823
45.2%
Common
ValueCountFrequency (%)
9090878
81.1%
; 1574505
 
14.0%
- 261926
 
2.3%
. 156410
 
1.4%
' 29989
 
0.3%
( 25796
 
0.2%
) 25788
 
0.2%
1 9720
 
0.1%
9 8640
 
0.1%
6 5668
 
0.1%
Other values (21) 24241
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 72889532
99.7%
None 232369
 
0.3%
Punctuation 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9090878
 
12.5%
e 6123944
 
8.4%
n 5043429
 
6.9%
a 4353925
 
6.0%
r 3514483
 
4.8%
o 3466711
 
4.8%
i 2903594
 
4.0%
s 2322445
 
3.2%
l 2220840
 
3.0%
t 2048480
 
2.8%
Other values (68) 31800803
43.6%
None
ValueCountFrequency (%)
é 66202
28.5%
ü 43282
18.6%
ö 22417
 
9.6%
á 20043
 
8.6%
è 16968
 
7.3%
í 12114
 
5.2%
ñ 10827
 
4.7%
ó 8648
 
3.7%
ë 5441
 
2.3%
ä 4932
 
2.1%
Other values (35) 21495
 
9.3%
Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%

otherCatalogNumbers
Text

Missing 

Distinct: 1247855
Distinct (%): 97.7%
Missing: 3742862
Missing (%): 74.6%
Memory size: 38.3 MiB
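
The percentages for otherCatalogNumbers can be cross-checked with a little arithmetic. The Python sketch below assumes that Missing (%) is relative to all observations while Distinct (%) and mean length are relative to the non-missing rows; that reading is an inference from the figures (it reproduces them), not something the report states. The variable names are illustrative.

```python
# Figures copied from the report.
total_rows = 5019782    # "Number of observations" in the overview
missing = 3742862       # "Missing" for otherCatalogNumbers
distinct = 1247855      # "Distinct" for otherCatalogNumbers
total_chars = 14046257  # "Total characters" for otherCatalogNumbers

# Assumption: distinct share and mean length are computed over non-missing rows.
present = total_rows - missing
missing_pct = 100 * missing / total_rows
distinct_pct = 100 * distinct / present
mean_length = total_chars / present

print(f"non-missing rows: {present}")
print(f"missing:  {missing_pct:.1f}%")
print(f"distinct: {distinct_pct:.1f}%")
print(f"mean length: {mean_length:.8f}")
```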

Length

Max length: 28
Median length: 10
Mean length: 11.00010729
Min length: 1

Characters and Unicode

Total characters: 14046257
Distinct characters: 79
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1228633
Unique (%): 96.2%

Sample

1st row: L 0215467
2nd row: L 0215532
3rd row: L 0204325
4th row: L 0542724
5th row: L 0973113
ValueCountFrequency (%)
l 605823
25.3%
176059
 
7.3%
u 146244
 
6.1%
uw 6074
 
0.3%
b 4013
 
0.2%
a 2407
 
0.1%
0 681
 
< 0.1%
k 377
 
< 0.1%
jan.99 305
 
< 0.1%
okt.00 265
 
< 0.1%
Other values (1309143) 1457050
60.7%

Most occurring characters

ValueCountFrequency (%)
0 2090566
14.9%
1874427
13.3%
1 1087243
 
7.7%
2 1003157
 
7.1%
3 926296
 
6.6%
9 902389
 
6.4%
4 845700
 
6.0%
5 836296
 
6.0%
8 818449
 
5.8%
6 793280
 
5.6%
Other values (69) 2868454
20.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 10037815
71.5%
Space Separator 1874427
 
13.3%
Uppercase Letter 1854729
 
13.2%
Math Symbol 176021
 
1.3%
Lowercase Letter 95261
 
0.7%
Other Punctuation 7322
 
0.1%
Dash Punctuation 645
 
< 0.1%
Modifier Symbol 30
 
< 0.1%
Close Punctuation 4
 
< 0.1%
Open Punctuation 2
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
L 606318
32.7%
A 335894
18.1%
W 323267
17.4%
G 322647
17.4%
U 196893
 
10.6%
D 15749
 
0.8%
M 9666
 
0.5%
F 6715
 
0.4%
O 5973
 
0.3%
H 5717
 
0.3%
Other values (16) 25890
 
1.4%
Lowercase Letter
ValueCountFrequency (%)
w 67201
70.5%
e 2900
 
3.0%
u 2872
 
3.0%
i 2400
 
2.5%
a 2331
 
2.4%
n 1916
 
2.0%
j 1796
 
1.9%
p 1596
 
1.7%
t 1593
 
1.7%
l 1545
 
1.6%
Other values (15) 9111
 
9.6%
Decimal Number
ValueCountFrequency (%)
0 2090566
20.8%
1 1087243
10.8%
2 1003157
10.0%
3 926296
9.2%
9 902389
9.0%
4 845700
8.4%
5 836296
8.3%
8 818449
 
8.2%
6 793280
 
7.9%
7 734439
 
7.3%
Other Punctuation
ValueCountFrequency (%)
. 6475
88.4%
: 671
 
9.2%
/ 151
 
2.1%
? 7
 
0.1%
! 5
 
0.1%
, 4
 
0.1%
* 4
 
0.1%
\ 3
 
< 0.1%
' 2
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
| 176015
> 99.9%
+ 6
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
] 2
50.0%
) 2
50.0%
Space Separator
ValueCountFrequency (%)
1874427
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 645
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 30
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 12096267
86.1%
Latin 1949990
 
13.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
L 606318
31.1%
A 335894
17.2%
W 323267
16.6%
G 322647
16.5%
U 196893
 
10.1%
w 67201
 
3.4%
D 15749
 
0.8%
M 9666
 
0.5%
F 6715
 
0.3%
O 5973
 
0.3%
Other values (41) 59667
 
3.1%
Common
ValueCountFrequency (%)
0 2090566
17.3%
1874427
15.5%
1 1087243
9.0%
2 1003157
8.3%
3 926296
7.7%
9 902389
7.5%
4 845700
7.0%
5 836296
6.9%
8 818449
 
6.8%
6 793280
 
6.6%
Other values (18) 918464
7.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14046257
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 2090566
14.9%
1874427
13.3%
1 1087243
 
7.7%
2 1003157
 
7.1%
3 926296
 
6.6%
9 902389
 
6.4%
4 845700
 
6.0%
5 836296
 
6.0%
8 818449
 
5.8%
6 793280
 
5.6%
Other values (69) 2868454
20.4%

eventDate
Text

Missing 

Distinct: 67961
Distinct (%): 1.6%
Missing: 856201
Missing (%): 17.1%
Memory size: 38.3 MiB

Length

Max length: 21
Median length: 10
Mean length: 11.80088967
Min length: 10

Characters and Unicode

Total characters: 49133960
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6056
Unique (%): 0.1%

Sample

1st row: 1933-04-24
2nd row: 1956-05-14
3rd row: 1939-05-21
4th row: 1955-04-26
5th row: 1838-05-01/1838-05-31
ValueCountFrequency (%)
1859-01-01/1859-12-31 5064
 
0.1%
1857-01-01/1857-12-31 3575
 
0.1%
1898-01-01/1898-12-31 3352
 
0.1%
1922-10-01/1922-10-31 2927
 
0.1%
1912-01-01/1912-12-31 2915
 
0.1%
1840-01-01/1840-12-31 2864
 
0.1%
1880-01-01/1880-12-31 2677
 
0.1%
1893-01-01/1893-12-31 2625
 
0.1%
1909-01-01/1909-12-31 2617
 
0.1%
1900-01-01/1900-12-31 2597
 
0.1%
Other values (67951) 4132368
99.3%
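
The eventDate values shown here take two shapes: a single ISO date (10 characters) or a start/end pair joined by a slash (21 characters). A minimal Python sketch for reading both shapes follows; the function name `parse_event_date` is hypothetical, and it assumes every value matches one of these two shapes, which the length and character statistics suggest but do not prove.

```python
from datetime import date


def parse_event_date(text):
    """Return (start, end) for 'YYYY-MM-DD' or 'YYYY-MM-DD/YYYY-MM-DD'."""
    start_text, _, end_text = text.partition("/")
    start = date.fromisoformat(start_text)
    # A single date is treated as a one-day interval.
    end = date.fromisoformat(end_text) if end_text else start
    return start, end


single = parse_event_date("1933-04-24")
month = parse_event_date("1838-05-01/1838-05-31")
span_days = (month[1] - month[0]).days + 1  # inclusive length of the interval
print(single, month, span_days)
```

Interval lengths computed this way make it easy to separate day-precision records from month-precision or year-precision ones such as 1859-01-01/1859-12-31.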

Most occurring characters

ValueCountFrequency (%)
1 9868786
20.1%
- 9690462
19.7%
0 7477809
15.2%
9 5666608
11.5%
2 3122979
 
6.4%
8 2608202
 
5.3%
3 2323651
 
4.7%
6 2166097
 
4.4%
7 2156107
 
4.4%
5 1905253
 
3.9%
Other values (2) 2148006
 
4.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 38761848
78.9%
Dash Punctuation 9690462
 
19.7%
Other Punctuation 681650
 
1.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 9868786
25.5%
0 7477809
19.3%
9 5666608
14.6%
2 3122979
 
8.1%
8 2608202
 
6.7%
3 2323651
 
6.0%
6 2166097
 
5.6%
7 2156107
 
5.6%
5 1905253
 
4.9%
4 1466356
 
3.8%
Dash Punctuation
ValueCountFrequency (%)
- 9690462
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 681650
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 49133960
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 9868786
20.1%
- 9690462
19.7%
0 7477809
15.2%
9 5666608
11.5%
2 3122979
 
6.4%
8 2608202
 
5.3%
3 2323651
 
4.7%
6 2166097
 
4.4%
7 2156107
 
4.4%
5 1905253
 
3.9%
Other values (2) 2148006
 
4.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 49133960
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 9868786
20.1%
- 9690462
19.7%
0 7477809
15.2%
9 5666608
11.5%
2 3122979
 
6.4%
8 2608202
 
5.3%
3 2323651
 
4.7%
6 2166097
 
4.4%
7 2156107
 
4.4%
5 1905253
 
3.9%
Other values (2) 2148006
 
4.4%

habitat
Text

Missing 

Distinct: 339001
Distinct (%): 37.2%
Missing: 4109448
Missing (%): 81.9%
Memory size: 38.3 MiB

Length

Max length: 24282
Median length: 668
Mean length: 39.6989929
Min length: 1

Characters and Unicode

Total characters: 36139343
Distinct characters: 168
Distinct categories: 18
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 233613
Unique (%): 25.7%

Sample

1st row: Old forest
2nd row: Old forest Very scanty
3rd row: Old forest, steep ridge
4th row: Old forest, clayey soil, sloping country, scanty
5th row: Degrade forest
ValueCountFrequency (%)
forest 416697
 
7.7%
in 196774
 
3.7%
on 168551
 
3.1%
of 89312
 
1.7%
soil 89138
 
1.7%
primary 81686
 
1.5%
with 73747
 
1.4%
secondary 71636
 
1.3%
the 66144
 
1.2%
and 64092
 
1.2%
Other values (93782) 4061990
75.5%

Most occurring characters

ValueCountFrequency (%)
4478314
 
12.4%
e 3514056
 
9.7%
r 2590696
 
7.2%
a 2510661
 
6.9%
o 2468359
 
6.8%
n 2068334
 
5.7%
s 1995921
 
5.5%
t 1847880
 
5.1%
i 1812691
 
5.0%
d 1411076
 
3.9%
Other values (158) 11441355
31.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 28588916
79.1%
Space Separator 4478314
 
12.4%
Other Punctuation 1470486
 
4.1%
Uppercase Letter 1324005
 
3.7%
Decimal Number 116345
 
0.3%
Dash Punctuation 77829
 
0.2%
Open Punctuation 30001
 
0.1%
Close Punctuation 29898
 
0.1%
Control 12008
 
< 0.1%
Math Symbol 10534
 
< 0.1%
Other values (8) 1007
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 3514056
12.3%
r 2590696
 
9.1%
a 2510661
 
8.8%
o 2468359
 
8.6%
n 2068334
 
7.2%
s 1995921
 
7.0%
t 1847880
 
6.5%
i 1812691
 
6.3%
d 1411076
 
4.9%
l 1409280
 
4.9%
Other values (47) 6959962
24.3%
Uppercase Letter
ValueCountFrequency (%)
S 172219
13.0%
O 129329
 
9.8%
P 103042
 
7.8%
F 82210
 
6.2%
I 81943
 
6.2%
R 75431
 
5.7%
A 72320
 
5.5%
D 69603
 
5.3%
C 68369
 
5.2%
M 67914
 
5.1%
Other values (34) 401625
30.3%
Other Punctuation
ValueCountFrequency (%)
. 943776
64.2%
, 401980
27.3%
; 77327
 
5.3%
' 17235
 
1.2%
/ 11370
 
0.8%
: 7625
 
0.5%
& 3770
 
0.3%
? 3374
 
0.2%
" 2368
 
0.2%
% 914
 
0.1%
Other values (9) 747
 
0.1%
Decimal Number
ValueCountFrequency (%)
0 31850
27.4%
1 18139
15.6%
5 15491
13.3%
2 14398
12.4%
3 10095
 
8.7%
4 8421
 
7.2%
9 4806
 
4.1%
6 4747
 
4.1%
8 4630
 
4.0%
7 3768
 
3.2%
Math Symbol
ValueCountFrequency (%)
+ 8421
79.9%
± 961
 
9.1%
= 459
 
4.4%
> 261
 
2.5%
< 223
 
2.1%
| 157
 
1.5%
~ 47
 
0.4%
× 4
 
< 0.1%
÷ 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 27667
92.2%
[ 2321
 
7.7%
{ 8
 
< 0.1%
5
 
< 0.1%
Modifier Symbol
ValueCountFrequency (%)
` 26
66.7%
´ 8
 
20.5%
^ 4
 
10.3%
¨ 1
 
2.6%
Close Punctuation
ValueCountFrequency (%)
) 27591
92.3%
] 2299
 
7.7%
} 8
 
< 0.1%
Other Number
ValueCountFrequency (%)
½ 148
70.5%
² 60
28.6%
¼ 2
 
1.0%
Final Punctuation
ValueCountFrequency (%)
17
89.5%
1
 
5.3%
» 1
 
5.3%
Dash Punctuation
ValueCountFrequency (%)
- 77753
99.9%
76
 
0.1%
Control
ValueCountFrequency (%)
11945
99.5%
63
 
0.5%
Initial Punctuation
ValueCountFrequency (%)
17
94.4%
« 1
 
5.6%
Currency Symbol
ValueCountFrequency (%)
£ 3
60.0%
¢ 2
40.0%
Space Separator
ValueCountFrequency (%)
4478314
100.0%
Other Symbol
ValueCountFrequency (%)
° 612
100.0%
Other Letter
ValueCountFrequency (%)
º 57
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 47
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 29912953
82.8%
Common 6226390
 
17.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 3514056
11.7%
r 2590696
 
8.7%
a 2510661
 
8.4%
o 2468359
 
8.3%
n 2068334
 
6.9%
s 1995921
 
6.7%
t 1847880
 
6.2%
i 1812691
 
6.1%
d 1411076
 
4.7%
l 1409280
 
4.7%
Other values (91) 8283999
27.7%
Common
ValueCountFrequency (%)
4478314
71.9%
. 943776
 
15.2%
, 401980
 
6.5%
- 77753
 
1.2%
; 77327
 
1.2%
0 31850
 
0.5%
( 27667
 
0.4%
) 27591
 
0.4%
1 18139
 
0.3%
' 17235
 
0.3%
Other values (57) 124758
 
2.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 36039670
99.7%
None 99549
 
0.3%
Punctuation 124
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4478314
 
12.4%
e 3514056
 
9.8%
r 2590696
 
7.2%
a 2510661
 
7.0%
o 2468359
 
6.8%
n 2068334
 
5.7%
s 1995921
 
5.5%
t 1847880
 
5.1%
i 1812691
 
5.0%
d 1411076
 
3.9%
Other values (86) 11341682
31.5%
None
ValueCountFrequency (%)
é 30664
30.8%
ê 27028
27.2%
è 14313
14.4%
à 7116
 
7.1%
á 3678
 
3.7%
ä 2477
 
2.5%
í 1621
 
1.6%
ü 1613
 
1.6%
ú 1252
 
1.3%
ó 1130
 
1.1%
Other values (56) 8657
 
8.7%
Punctuation
ValueCountFrequency (%)
76
61.3%
17
 
13.7%
17
 
13.7%
8
 
6.5%
5
 
4.0%
1
 
0.8%

samplingEffort
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 5019781
Missing (%): > 99.9%
Memory size: 38.3 MiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 5
Distinct characters: 4
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 0.0 m
ValueCountFrequency (%)
0.0 1
50.0%
m 1
50.0%

Most occurring characters

ValueCountFrequency (%)
0 2
40.0%
. 1
20.0%
1
20.0%
m 1
20.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2
40.0%
Other Punctuation 1
20.0%
Space Separator 1
20.0%
Lowercase Letter 1
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%
Lowercase Letter
ValueCountFrequency (%)
m 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4
80.0%
Latin 1
 
20.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 2
50.0%
. 1
25.0%
1
25.0%
Latin
ValueCountFrequency (%)
m 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 2
40.0%
. 1
20.0%
1
20.0%
m 1
20.0%

continent
Text

Missing 

Distinct: 9
Distinct (%): < 0.1%
Missing: 902365
Missing (%): 18.0%
Memory size: 38.3 MiB

Length

Max length: 16
Median length: 15
Mean length: 7.327123048
Min length: 4

Characters and Unicode

Total characters: 30168821
Distinct characters: 22
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Europe
2nd row: Asia
3rd row: Europe
4th row: Asia
5th row: Europe
ValueCountFrequency (%)
asia 1235811
25.9%
europe 1145221
24.0%
africa 713929
14.9%
america 661234
13.8%
southern 417866
 
8.7%
australasia 358600
 
7.5%
central 124578
 
2.6%
north 118790
 
2.5%
antarctica 1561
 
< 0.1%
africa/asia 1061
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
a 3816596
12.7%
r 3542840
11.7%
A 2973257
9.9%
i 2973257
9.9%
e 2348899
 
7.8%
s 1954072
 
6.5%
u 1921687
 
6.4%
o 1681877
 
5.6%
c 1379346
 
4.6%
p 1145221
 
3.8%
Other values (12) 6431769
21.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 24726814
82.0%
Uppercase Letter 4779712
 
15.8%
Space Separator 661234
 
2.2%
Other Punctuation 1061
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3816596
15.4%
r 3542840
14.3%
i 2973257
12.0%
e 2348899
9.5%
s 1954072
7.9%
u 1921687
7.8%
o 1681877
6.8%
c 1379346
 
5.6%
p 1145221
 
4.6%
t 1022956
 
4.1%
Other values (5) 2940063
11.9%
Uppercase Letter
ValueCountFrequency (%)
A 2973257
62.2%
E 1145221
 
24.0%
S 417866
 
8.7%
C 124578
 
2.6%
N 118790
 
2.5%
Space Separator
ValueCountFrequency (%)
661234
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 1061
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 29506526
97.8%
Common 662295
 
2.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3816596
12.9%
r 3542840
12.0%
A 2973257
10.1%
i 2973257
10.1%
e 2348899
8.0%
s 1954072
 
6.6%
u 1921687
 
6.5%
o 1681877
 
5.7%
c 1379346
 
4.7%
p 1145221
 
3.9%
Other values (10) 5769474
19.6%
Common
ValueCountFrequency (%)
661234
99.8%
/ 1061
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30168821
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3816596
12.7%
r 3542840
11.7%
A 2973257
9.9%
i 2973257
9.9%
e 2348899
 
7.8%
s 1954072
 
6.5%
u 1921687
 
6.4%
o 1681877
 
5.6%
c 1379346
 
4.6%
p 1145221
 
3.8%
Other values (12) 6431769
21.3%

island
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 5019781
Missing (%): > 99.9%
Memory size: 38.3 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 8
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 51.41942
ValueCountFrequency (%)
51.41942 1
100.0%
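
The single non-missing island value, 51.41942, reads like a decimal coordinate rather than an island name, which may indicate that the fields of that row are misaligned in the source data. The Python sketch below shows one way to flag such values; the function name `looks_like_decimal_coordinate` and the range check are assumptions chosen for illustration, not part of the report.

```python
def looks_like_decimal_coordinate(value):
    """Heuristic: True if a text value parses as a decimal number within +/-180."""
    try:
        number = float(value)
    except ValueError:
        # Ordinary place names such as "Borneo" do not parse as numbers.
        return False
    # Require a decimal point so that plain integers are not flagged.
    return "." in value and -180.0 <= number <= 180.0


for candidate in ["51.41942", "Borneo", "WGS84"]:
    print(candidate, looks_like_decimal_coordinate(candidate))
```

A check like this only raises a flag for review; it cannot say which column the value actually belongs to.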

Most occurring characters

ValueCountFrequency (%)
1 2
25.0%
4 2
25.0%
5 1
12.5%
. 1
12.5%
9 1
12.5%
2 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7
87.5%
Other Punctuation 1
 
12.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 2
28.6%
4 2
28.6%
5 1
14.3%
9 1
14.3%
2 1
14.3%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 2
25.0%
4 2
25.0%
5 1
12.5%
. 1
12.5%
9 1
12.5%
2 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2
25.0%
4 2
25.0%
5 1
12.5%
. 1
12.5%
9 1
12.5%
2 1
12.5%

country
Text

Distinct: 259
Distinct (%): < 0.1%
Missing: 375
Missing (%): < 0.1%
Memory size: 38.3 MiB

Length

Max length: 38
Median length: 33
Mean length: 9.385069392
Min length: 4

Characters and Unicode

Total characters: 47107483
Distinct characters: 67
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: France
2nd row: Indonesia
3rd row: France
4th row: Indonesia
5th row: Greece
ValueCountFrequency (%)
unknown 901458
 
14.7%
netherlands 668769
 
10.9%
indonesia 566953
 
9.2%
new 190960
 
3.1%
guinea 166419
 
2.7%
papua 152664
 
2.5%
brazil 120044
 
2.0%
united 119751
 
2.0%
france 116205
 
1.9%
australia 110396
 
1.8%
Other values (304) 3021381
49.2%

Most occurring characters

ValueCountFrequency (%)
n 6455572
13.7%
a 6144718
 
13.0%
e 3683911
 
7.8%
i 3049965
 
6.5%
o 2676245
 
5.7%
s 2153854
 
4.6%
r 2077292
 
4.4%
d 1855222
 
3.9%
l 1850248
 
3.9%
t 1563534
 
3.3%
Other values (57) 15596922
33.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 39272562
83.4%
Uppercase Letter 6271710
 
13.3%
Space Separator 1115878
 
2.4%
Other Punctuation 230177
 
0.5%
Close Punctuation 106346
 
0.2%
Open Punctuation 106346
 
0.2%
Dash Punctuation 4458
 
< 0.1%
Decimal Number 6
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 6455572
16.4%
a 6144718
15.6%
e 3683911
9.4%
i 3049965
 
7.8%
o 2676245
 
6.8%
s 2153854
 
5.5%
r 2077292
 
5.3%
d 1855222
 
4.7%
l 1850248
 
4.7%
t 1563534
 
4.0%
Other values (18) 7762001
19.8%
Uppercase Letter
ValueCountFrequency (%)
U 1043022
16.6%
N 915175
14.6%
I 762328
12.2%
S 624792
10.0%
M 401415
 
6.4%
G 377149
 
6.0%
A 373141
 
5.9%
C 366717
 
5.8%
P 335466
 
5.3%
B 239678
 
3.8%
Other values (14) 832827
13.3%
Decimal Number
ValueCountFrequency (%)
7 2
33.3%
3 1
16.7%
6 1
16.7%
9 1
16.7%
8 1
16.7%
Other Punctuation
ValueCountFrequency (%)
/ 212914
92.5%
, 11185
 
4.9%
& 6077
 
2.6%
. 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 106092
99.8%
] 254
 
0.2%
Open Punctuation
ValueCountFrequency (%)
( 106092
99.8%
[ 254
 
0.2%
Space Separator
ValueCountFrequency (%)
1115878
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4458
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 45544272
96.7%
Common 1563211
 
3.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 6455572
14.2%
a 6144718
13.5%
e 3683911
 
8.1%
i 3049965
 
6.7%
o 2676245
 
5.9%
s 2153854
 
4.7%
r 2077292
 
4.6%
d 1855222
 
4.1%
l 1850248
 
4.1%
t 1563534
 
3.4%
Other values (42) 14033711
30.8%
Common
ValueCountFrequency (%)
1115878
71.4%
/ 212914
 
13.6%
) 106092
 
6.8%
( 106092
 
6.8%
, 11185
 
0.7%
& 6077
 
0.4%
- 4458
 
0.3%
[ 254
 
< 0.1%
] 254
 
< 0.1%
7 2
 
< 0.1%
Other values (5) 5
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 47095541
> 99.9%
None 11942
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 6455572
13.7%
a 6144718
 
13.0%
e 3683911
 
7.8%
i 3049965
 
6.5%
o 2676245
 
5.7%
s 2153854
 
4.6%
r 2077292
 
4.4%
d 1855222
 
3.9%
l 1850248
 
3.9%
t 1563534
 
3.3%
Other values (55) 15584980
33.1%
None
ValueCountFrequency (%)
ç 9002
75.4%
é 2940
 
24.6%

countryCode
Text

Constant  Missing 

Distinct: 1
Distinct (%): 50.0%
Missing: 5019780
Missing (%): > 99.9%
Memory size: 38.3 MiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 10
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: WGS84
2nd row: WGS84
ValueCountFrequency (%)
wgs84 2
100.0%

Most occurring characters

ValueCountFrequency (%)
W 2
20.0%
G 2
20.0%
S 2
20.0%
8 2
20.0%
4 2
20.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 6
60.0%
Decimal Number 4
40.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
W 2
33.3%
G 2
33.3%
S 2
33.3%
Decimal Number
ValueCountFrequency (%)
8 2
50.0%
4 2
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6
60.0%
Common 4
40.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
W 2
33.3%
G 2
33.3%
S 2
33.3%
Common
ValueCountFrequency (%)
8 2
50.0%
4 2
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
W 2
20.0%
G 2
20.0%
S 2
20.0%
8 2
20.0%
4 2
20.0%

stateProvince
Text

Missing 

Distinct: 3223
Distinct (%): 0.2%
Missing: 3065004
Missing (%): 61.1%
Memory size: 38.3 MiB

Length

Max length: 35
Median length: 28
Mean length: 8.864043385
Min length: 3

Characters and Unicode

Total characters: 17327237
Distinct characters: 108
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 408
Unique (%): < 0.1%

Sample

1st row: Sumatra
2nd row: Borneo
3rd row: Borneo
4th row: Sumatra
5th row: Sumatra
ValueCountFrequency (%)
borneo 230539
 
9.3%
new 206395
 
8.3%
guinea 192672
 
7.8%
java 135629
 
5.5%
sumatra 84195
 
3.4%
region 83893
 
3.4%
northern 54146
 
2.2%
zuid-holland 53025
 
2.1%
gelderland 42250
 
1.7%
sulawesi 38230
 
1.5%
Other values (3222) 1356524
54.8%

Most occurring characters

ValueCountFrequency (%)
a 1947747
 
11.2%
e 1641332
 
9.5%
o 1463019
 
8.4%
n 1452254
 
8.4%
r 1147603
 
6.6%
i 879429
 
5.1%
u 873773
 
5.0%
l 667117
 
3.9%
t 638970
 
3.7%
s 549587
 
3.2%
Other values (98) 6066406
35.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13786801
79.6%
Uppercase Letter 2714365
 
15.7%
Space Separator 522745
 
3.0%
Dash Punctuation 255700
 
1.5%
Open Punctuation 19001
 
0.1%
Close Punctuation 18788
 
0.1%
Other Punctuation 9425
 
0.1%
Decimal Number 236
 
< 0.1%
Final Punctuation 171
 
< 0.1%
Math Symbol 5
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1947747
14.1%
e 1641332
11.9%
o 1463019
10.6%
n 1452254
10.5%
r 1147603
8.3%
i 879429
 
6.4%
u 873773
 
6.3%
l 667117
 
4.8%
t 638970
 
4.6%
s 549587
 
4.0%
Other values (42) 2525970
18.3%
Uppercase Letter
ValueCountFrequency (%)
N 396728
14.6%
S 316558
11.7%
B 299278
11.0%
G 278749
 
10.3%
J 140484
 
5.2%
M 133425
 
4.9%
L 124661
 
4.6%
H 116582
 
4.3%
R 104500
 
3.8%
C 91711
 
3.4%
Other values (23) 711689
26.2%
Decimal Number
ValueCountFrequency (%)
7 56
23.7%
6 53
22.5%
4 49
20.8%
3 37
15.7%
5 17
 
7.2%
8 15
 
6.4%
2 6
 
2.5%
1 3
 
1.3%
Other Punctuation
ValueCountFrequency (%)
. 5631
59.7%
' 2710
28.8%
, 1013
 
10.7%
& 34
 
0.4%
? 20
 
0.2%
/ 17
 
0.2%
Space Separator
ValueCountFrequency (%)
522740
> 99.9%
  5
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 255531
99.9%
169
 
0.1%
Math Symbol
ValueCountFrequency (%)
~ 4
80.0%
+ 1
 
20.0%
Open Punctuation
ValueCountFrequency (%)
( 19001
100.0%
Close Punctuation
ValueCountFrequency (%)
) 18788
100.0%
Final Punctuation
ValueCountFrequency (%)
171
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16501166
95.2%
Common 826071
 
4.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1947747
 
11.8%
e 1641332
 
9.9%
o 1463019
 
8.9%
n 1452254
 
8.8%
r 1147603
 
7.0%
i 879429
 
5.3%
u 873773
 
5.3%
l 667117
 
4.0%
t 638970
 
3.9%
s 549587
 
3.3%
Other values (75) 5240335
31.8%
Common
ValueCountFrequency (%)
522740
63.3%
- 255531
30.9%
( 19001
 
2.3%
) 18788
 
2.3%
. 5631
 
0.7%
' 2710
 
0.3%
, 1013
 
0.1%
171
 
< 0.1%
169
 
< 0.1%
7 56
 
< 0.1%
Other values (13) 261
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17185130
99.2%
None 141767
 
0.8%
Punctuation 340
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1947747
 
11.3%
e 1641332
 
9.6%
o 1463019
 
8.5%
n 1452254
 
8.5%
r 1147603
 
6.7%
i 879429
 
5.1%
u 873773
 
5.1%
l 667117
 
3.9%
t 638970
 
3.7%
s 549587
 
3.2%
Other values (62) 5924299
34.5%
None
ValueCountFrequency (%)
é 98312
69.3%
á 15495
 
10.9%
í 6772
 
4.8%
ó 4018
 
2.8%
ô 3957
 
2.8%
ü 3456
 
2.4%
ä 1785
 
1.3%
ã 1771
 
1.2%
è 1140
 
0.8%
ö 899
 
0.6%
Other values (24) 4162
 
2.9%
Punctuation
ValueCountFrequency (%)
171
50.3%
169
49.7%

locality
Text

Missing 

Distinct: 2397188
Distinct (%): 56.3%
Missing: 763737
Missing (%): 15.2%
Memory size: 38.3 MiB

Length

Max length: 736849
Median length: 84356
Mean length: 47.16343342
Min length: 1

Characters and Unicode

Total characters: 200729695
Distinct characters: 211
Distinct categories: 19
Distinct scripts: 2
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1909729
Unique (%): 44.9%

Sample

1st row: Nice.
2nd row: E. Coast Sumatra, Siak, Indrapura
3rd row: Corsica; Cargèse.
4th row: Patras, op rots, bij ruine.
5th row: West Borneo, Sintang G. Pahoe
ValueCountFrequency (%)
of 993695
 
3.3%
de 528801
 
1.8%
km 423817
 
1.4%
366885
 
1.2%
in 319701
 
1.1%
the 236036
 
0.8%
road 222131
 
0.7%
near 220755
 
0.7%
bij 193361
 
0.6%
district 189242
 
0.6%
Other values (1181095) 26099389
87.6%

Most occurring characters

ValueCountFrequency (%)
25605653
 
12.8%
a 17766400
 
8.9%
e 14813627
 
7.4%
n 11419769
 
5.7%
o 10867963
 
5.4%
i 10448228
 
5.2%
r 10001236
 
5.0%
t 7769321
 
3.9%
. 7553473
 
3.8%
s 6867565
 
3.4%
Other values (201) 77616460
38.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 134893361
67.2%
Space Separator 25605699
 
12.8%
Uppercase Letter 20272038
 
10.1%
Other Punctuation 13323633
 
6.6%
Decimal Number 2817461
 
1.4%
Control 2060911
 
1.0%
Dash Punctuation 719511
 
0.4%
Open Punctuation 478330
 
0.2%
Close Punctuation 476278
 
0.2%
Math Symbol 66580
 
< 0.1%
Other values (9) 15893
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 17766400
13.2%
e 14813627
11.0%
n 11419769
 
8.5%
o 10867963
 
8.1%
i 10448228
 
7.7%
r 10001236
 
7.4%
t 7769321
 
5.8%
s 6867565
 
5.1%
l 6810224
 
5.0%
u 5143259
 
3.8%
Other values (52) 32985769
24.5%
Uppercase Letter
ValueCountFrequency (%)
S 1935677
 
9.5%
P 1549270
 
7.6%
M 1469722
 
7.2%
B 1423966
 
7.0%
N 1251858
 
6.2%
C 1191407
 
5.9%
A 1117672
 
5.5%
R 950359
 
4.7%
T 945685
 
4.7%
L 891981
 
4.4%
Other values (49) 7544441
37.2%
Other Punctuation
ValueCountFrequency (%)
. 7553473
56.7%
, 4275769
32.1%
: 615518
 
4.6%
; 205722
 
1.5%
' 177844
 
1.3%
/ 148168
 
1.1%
! 115191
 
0.9%
* 109104
 
0.8%
" 65200
 
0.5%
? 32465
 
0.2%
Other values (13) 25179
 
0.2%
Decimal Number
ValueCountFrequency (%)
1 514984
18.3%
0 394884
14.0%
2 365497
13.0%
5 337037
12.0%
3 291722
10.4%
4 261157
9.3%
6 215405
7.6%
8 157352
 
5.6%
7 147731
 
5.2%
9 131692
 
4.7%
Math Symbol
ValueCountFrequency (%)
| 22174
33.3%
± 20926
31.4%
= 14977
22.5%
> 3950
 
5.9%
< 2231
 
3.4%
+ 2163
 
3.2%
~ 108
 
0.2%
× 45
 
0.1%
÷ 4
 
< 0.1%
¬ 2
 
< 0.1%
Other Number
ValueCountFrequency (%)
½ 5642
77.1%
¼ 984
 
13.4%
¾ 570
 
7.8%
² 92
 
1.3%
³ 29
 
0.4%
¹ 2
 
< 0.1%
Currency Symbol
ValueCountFrequency (%)
¢ 30
53.6%
$ 19
33.9%
¤ 3
 
5.4%
¥ 2
 
3.6%
£ 1
 
1.8%
1
 
1.8%
Open Punctuation
ValueCountFrequency (%)
( 432219
90.4%
[ 45283
 
9.5%
530
 
0.1%
207
 
< 0.1%
{ 91
 
< 0.1%
Modifier Symbol
ValueCountFrequency (%)
´ 87
41.4%
` 86
41.0%
^ 33
 
15.7%
¨ 3
 
1.4%
¯ 1
 
0.5%
Final Punctuation
ValueCountFrequency (%)
» 163
55.3%
82
27.8%
28
 
9.5%
22
 
7.5%
Initial Punctuation
ValueCountFrequency (%)
« 158
70.5%
43
 
19.2%
18
 
8.0%
5
 
2.2%
Dash Punctuation
ValueCountFrequency (%)
- 719396
> 99.9%
107
 
< 0.1%
8
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 431154
90.5%
] 45061
 
9.5%
} 63
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
° 5587
99.5%
® 19
 
0.3%
¦ 10
 
0.2%
Space Separator
ValueCountFrequency (%)
25605653
> 99.9%
  46
 
< 0.1%
Control
ValueCountFrequency (%)
2050064
99.5%
10847
 
0.5%
Other Letter
ValueCountFrequency (%)
º 1203
98.0%
ª 25
 
2.0%
Connector Punctuation
ValueCountFrequency (%)
_ 932
100.0%
Modifier Letter
ValueCountFrequency (%)
ˆ 13
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 155166627
77.3%
Common 45563068
 
22.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 17766400
 
11.4%
e 14813627
 
9.5%
n 11419769
 
7.4%
o 10867963
 
7.0%
i 10448228
 
6.7%
r 10001236
 
6.4%
t 7769321
 
5.0%
s 6867565
 
4.4%
l 6810224
 
4.4%
u 5143259
 
3.3%
Other values (113) 53259035
34.3%
Common
ValueCountFrequency (%)
25605653
56.2%
. 7553473
 
16.6%
, 4275769
 
9.4%
2050064
 
4.5%
- 719396
 
1.6%
: 615518
 
1.4%
1 514984
 
1.1%
( 432219
 
0.9%
) 431154
 
0.9%
0 394884
 
0.9%
Other values (78) 2969954
 
6.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 200038856
99.7%
None 689579
 
0.3%
Punctuation 1246
 
< 0.1%
Modifier Letters 13
 
< 0.1%
Currency Symbols 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
25605653
 
12.8%
a 17766400
 
8.9%
e 14813627
 
7.4%
n 11419769
 
5.7%
o 10867963
 
5.4%
i 10448228
 
5.2%
r 10001236
 
5.0%
t 7769321
 
3.9%
. 7553473
 
3.8%
s 6867565
 
3.4%
Other values (87) 76925621
38.5%
None
ValueCountFrequency (%)
é 242306
35.1%
á 51416
 
7.5%
è 49482
 
7.2%
ü 38173
 
5.5%
ö 30031
 
4.4%
í 28900
 
4.2%
ë 25491
 
3.7%
ó 23636
 
3.4%
ä 23245
 
3.4%
ê 23036
 
3.3%
Other values (88) 153863
22.3%
Punctuation
ValueCountFrequency (%)
530
42.5%
207
 
16.6%
107
 
8.6%
82
 
6.6%
76
 
6.1%
74
 
5.9%
43
 
3.5%
40
 
3.2%
28
 
2.2%
22
 
1.8%
Other values (4) 37
 
3.0%
Modifier Letters
ValueCountFrequency (%)
ˆ 13
100.0%
Currency Symbols
ValueCountFrequency (%)
1
100.0%

verbatimElevation
Text

Missing 

Distinct: 7745
Distinct (%): 0.4%
Missing: 3265629
Missing (%): 65.1%
Memory size: 38.3 MiB

Length

Max length: 21
Median length: 5
Mean length: 6.313657931
Min length: 5

Characters and Unicode

Total characters: 11075122
Distinct characters: 14
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
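
As a rough illustration of the character statistics reported in these sections, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own implementation; the helper name `category_counts` and the sample list (copied from the five sample rows of verbatimElevation) are chosen for illustration only. The standard library exposes the short category codes (`Nd`, `Zs`, `Po`, `Ll`) rather than the long names used in the tables (Decimal Number, Space Separator, Other Punctuation, Lowercase Letter).

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in the given strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            # unicodedata.category returns a two-letter code such as "Nd" or "Ll".
            counts[unicodedata.category(ch)] += 1
    return counts


# Sample rows shown for verbatimElevation (illustrative input only).
sample = ["10.0 m", "600.0 m", "250.0 m", "20.0 m", "4.0 m"]
counts = category_counts(sample)
print(counts)
```

Script and block statistics are not covered here, because `unicodedata` does not expose those properties.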

Unique

Unique: 2081
Unique (%): 0.1%

Sample

1st row: 10.0 m
2nd row: 600.0 m
3rd row: 250.0 m
4th row: 20.0 m
5th row: 4.0 m
ValueCountFrequency (%)
m 1754153
47.7%
0.0 971270
26.4%
83446
 
2.3%
100.0 30322
 
0.8%
200.0 27380
 
0.7%
50.0 25985
 
0.7%
300.0 21545
 
0.6%
400.0 20709
 
0.6%
500.0 20313
 
0.6%
1000.0 18743
 
0.5%
Other values (3230) 701332
 
19.1%

Most occurring characters

ValueCountFrequency (%)
0 3847172
34.7%
1921045
17.3%
. 1837599
16.6%
m 1754153
15.8%
1 378270
 
3.4%
5 310048
 
2.8%
2 249943
 
2.3%
3 159594
 
1.4%
4 133323
 
1.2%
6 114954
 
1.0%
Other values (4) 369021
 
3.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5476760
49.5%
Space Separator 1921045
 
17.3%
Other Punctuation 1837599
 
16.6%
Lowercase Letter 1754153
 
15.8%
Dash Punctuation 85565
 
0.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 3847172
70.2%
1 378270
 
6.9%
5 310048
 
5.7%
2 249943
 
4.6%
3 159594
 
2.9%
4 133323
 
2.4%
6 114954
 
2.1%
7 110378
 
2.0%
8 95036
 
1.7%
9 78042
 
1.4%
Space Separator
ValueCountFrequency (%)
1921045
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1837599
100.0%
Lowercase Letter
ValueCountFrequency (%)
m 1754153
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 85565
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 9320969
84.2%
Latin 1754153
 
15.8%

Most frequent character per script

Common
ValueCountFrequency (%)
0 3847172
41.3%
1921045
20.6%
. 1837599
19.7%
1 378270
 
4.1%
5 310048
 
3.3%
2 249943
 
2.7%
3 159594
 
1.7%
4 133323
 
1.4%
6 114954
 
1.2%
7 110378
 
1.2%
Other values (3) 258643
 
2.8%
Latin
ValueCountFrequency (%)
m 1754153
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11075122
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 3847172
34.7%
1921045
17.3%
. 1837599
16.6%
m 1754153
15.8%
1 378270
 
3.4%
5 310048
 
2.8%
2 249943
 
2.3%
3 159594
 
1.4%
4 133323
 
1.2%
6 114954
 
1.0%
Other values (4) 369021
 
3.3%
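
The verbatimElevation values are stored as text such as "600.0 m". A minimal sketch of converting them to numbers follows, assuming the common form is a single decimal number followed by "m". The function name `parse_elevation_m` and the regular expression are illustrative assumptions. The dash counts above may reflect negative values, ranges, or both; this sketch accepts a leading minus sign and returns `None` for anything else, so that unparsed values can be reviewed rather than silently coerced.

```python
import re

# Assumed pattern: optional minus sign, a decimal number, then the unit "m".
_ELEVATION = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*m\s*$")


def parse_elevation_m(text):
    """Return the elevation in metres as a float, or None if the text does not match."""
    if text is None:
        return None
    match = _ELEVATION.match(text)
    return float(match.group(1)) if match else None


for raw in ["10.0 m", "600.0 m", "-5.0 m", "100.0 - 200.0 m", "Asia"]:
    print(repr(raw), "->", parse_elevation_m(raw))
```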

maximumDistanceAboveSurfaceInMeters
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 5019781
Missing (%): > 99.9%
Memory size: 38.3 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 4
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Asia
ValueCountFrequency (%)
asia 1
100.0%

Most occurring characters

ValueCountFrequency (%)
A 1
25.0%
s 1
25.0%
i 1
25.0%
a 1
25.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3
75.0%
Uppercase Letter 1
 
25.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 1
33.3%
i 1
33.3%
a 1
33.3%
Uppercase Letter
ValueCountFrequency (%)
A 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 1
25.0%
s 1
25.0%
i 1
25.0%
a 1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 1
25.0%
s 1
25.0%
i 1
25.0%
a 1
25.0%
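
The single non-missing value here ("Asia") is not a distance in metres, and several other columns in this report hold exactly one value of the wrong kind (for example "150.0 m" in verbatimSRS). One possible explanation, which the report itself does not confirm, is a single record whose fields were shifted during parsing. The sketch below shows one way to locate such records: flag every row that has a value in a column expected to be empty. The function name, the tab-separated toy data, and the catalog numbers are hypothetical.

```python
import csv
import io


def rows_with_unexpected_values(rows, sparse_columns):
    """Return (row_index, {column: value}) for rows that populate normally-empty columns."""
    flagged = []
    for index, row in enumerate(rows):
        filled = {}
        for column in sparse_columns:
            value = (row.get(column) or "").strip()
            if value:
                filled[column] = value
        if filled:
            flagged.append((index, filled))
    return flagged


# Hypothetical three-row extract; only the second row populates the sparse columns.
data = (
    "catalogNumber\tmaximumDistanceAboveSurfaceInMeters\tverbatimSRS\n"
    "X.1\t\t\n"
    "X.2\tAsia\t150.0 m\n"
    "X.3\t\t\n"
)
reader = csv.DictReader(io.StringIO(data), delimiter="\t")
flagged = rows_with_unexpected_values(
    reader, ["maximumDistanceAboveSurfaceInMeters", "verbatimSRS"]
)
print(flagged)
```

Inspecting the flagged rows in the raw file would show whether the values are genuinely shifted or simply mis-entered.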

decimalLatitude
Text

Missing 

Distinct: 85312
Distinct (%): 4.1%
Missing: 2925885
Missing (%): 58.3%
Memory size: 38.3 MiB

Length

Max length: 9
Median length: 8
Mean length: 6.935163477
Min length: 3

Characters and Unicode

Total characters: 14521518
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 32905
Unique (%): 1.6%

Sample

1st row: -2.06667
2nd row: 0.0
3rd row: -2.18333
4th row: -2.18333
5th row: 1.16667
ValueCountFrequency (%)
52.16011 15831
 
0.8%
7.25 9141
 
0.4%
5.83333 8412
 
0.4%
3.08333 7806
 
0.4%
1.0 7629
 
0.4%
6.08333 7109
 
0.3%
5.38333 6962
 
0.3%
5.33333 6857
 
0.3%
52.14714 6321
 
0.3%
5.0 6138
 
0.3%
Other values (80543) 2011691
96.1%

Most occurring characters

ValueCountFrequency (%)
3 2185740
15.1%
. 2093897
14.4%
6 1593621
11.0%
5 1522084
10.5%
1 1433173
9.9%
7 1070771
7.4%
2 1042278
7.2%
8 898646
6.2%
4 733048
 
5.0%
0 727483
 
5.0%
Other values (3) 1220777
8.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 11817460
81.4%
Other Punctuation 2093897
 
14.4%
Dash Punctuation 610144
 
4.2%
Uppercase Letter 17
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 2185740
18.5%
6 1593621
13.5%
5 1522084
12.9%
1 1433173
12.1%
7 1070771
9.1%
2 1042278
8.8%
8 898646
7.6%
4 733048
 
6.2%
0 727483
 
6.2%
9 610616
 
5.2%
Other Punctuation
ValueCountFrequency (%)
. 2093897
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 610144
100.0%
Uppercase Letter
ValueCountFrequency (%)
E 17
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 14521501
> 99.9%
Latin 17
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
3 2185740
15.1%
. 2093897
14.4%
6 1593621
11.0%
5 1522084
10.5%
1 1433173
9.9%
7 1070771
7.4%
2 1042278
7.2%
8 898646
6.2%
4 733048
 
5.0%
0 727483
 
5.0%
Other values (2) 1220760
8.4%
Latin
ValueCountFrequency (%)
E 17
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14521518
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 2185740
15.1%
. 2093897
14.4%
6 1593621
11.0%
5 1522084
10.5%
1 1433173
9.9%
7 1070771
7.4%
2 1042278
7.2%
8 898646
6.2%
4 733048
 
5.0%
0 727483
 
5.0%
Other values (3) 1220777
8.4%

decimalLongitude
Text

Missing 

Distinct: 95316
Distinct (%): 4.6%
Missing: 2925885
Missing (%): 58.3%
Memory size: 38.3 MiB

Length

Max length: 10
Median length: 9
Mean length: 7.304637239
Min length: 3

Characters and Unicode

Total characters: 15295158
Distinct characters: 20
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 35846
Unique (%): 1.7%

Sample

1st row: 100.93333
2nd row: 112.0
3rd row: 99.65
4th row: 99.65
5th row: 124.58333
ValueCountFrequency (%)
4.49701 15831
 
0.8%
10.41667 7696
 
0.4%
4.05 7530
 
0.4%
3.01667 7109
 
0.3%
4.47406 6117
 
0.3%
5.85874 5829
 
0.3%
106.7913 5061
 
0.2%
4.32798 5000
 
0.2%
4.90993 4858
 
0.2%
4.47863 4793
 
0.2%
Other values (91291) 2024073
96.7%

Most occurring characters

ValueCountFrequency (%)
3 2153780
14.1%
. 2093896
13.7%
1 2080828
13.6%
6 1818372
11.9%
7 1184132
7.7%
5 1165208
7.6%
4 1075292
7.0%
8 917920
6.0%
0 850927
 
5.6%
9 849460
 
5.6%
Other values (10) 1105343
7.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 12912796
84.4%
Other Punctuation 2093896
 
13.7%
Dash Punctuation 288457
 
1.9%
Lowercase Letter 7
 
< 0.1%
Uppercase Letter 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 2153780
16.7%
1 2080828
16.1%
6 1818372
14.1%
7 1184132
9.2%
5 1165208
9.0%
4 1075292
8.3%
8 917920
7.1%
0 850927
 
6.6%
9 849460
 
6.6%
2 816877
 
6.3%
Lowercase Letter
ValueCountFrequency (%)
a 2
28.6%
h 1
14.3%
i 1
14.3%
l 1
14.3%
n 1
14.3%
d 1
14.3%
Uppercase Letter
ValueCountFrequency (%)
E 1
50.0%
T 1
50.0%
Other Punctuation
ValueCountFrequency (%)
. 2093896
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 288457
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 15295149
> 99.9%
Latin 9
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
3 2153780
14.1%
. 2093896
13.7%
1 2080828
13.6%
6 1818372
11.9%
7 1184132
7.7%
5 1165208
7.6%
4 1075292
7.0%
8 917920
6.0%
0 850927
 
5.6%
9 849460
 
5.6%
Other values (2) 1105334
7.2%
Latin
ValueCountFrequency (%)
a 2
22.2%
E 1
11.1%
T 1
11.1%
h 1
11.1%
i 1
11.1%
l 1
11.1%
n 1
11.1%
d 1
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15295158
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 2153780
14.1%
. 2093896
13.7%
1 2080828
13.6%
6 1818372
11.9%
7 1184132
7.7%
5 1165208
7.6%
4 1075292
7.0%
8 917920
6.0%
0 850927
 
5.6%
9 849460
 
5.6%
Other values (10) 1105343
7.2%
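
Both coordinate columns are profiled as text and contain a few letters. The 17 "E" characters in decimalLatitude would be consistent with scientific notation, and the letters in decimalLongitude (T, h, a, i, l, a, n, d, plus one E) would be consistent with one stray word such as "Thailand" and one value in scientific notation. Both readings are inferences from the character counts, not something the report states. A minimal validation sketch under those assumptions follows; `parse_coordinate` is a hypothetical helper, with limits of 90 for latitude and 180 for longitude.

```python
import math


def parse_coordinate(text, limit):
    """Parse a decimal-degree string; return None if it is not a number within +/- limit."""
    try:
        value = float(text)  # float() also accepts scientific notation such as "1.0E-5"
    except (TypeError, ValueError):
        return None
    if math.isnan(value) or abs(value) > limit:
        return None
    return value


for raw, limit in [("52.16011", 90), ("1.0E-5", 90), ("100.93333", 90),
                   ("100.93333", 180), ("Thailand", 180)]:
    print(repr(raw), limit, "->", parse_coordinate(raw, limit))
```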

geodeticDatum
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 3
Missing (%): < 0.1%
Memory size: 38.3 MiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 25098895
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: WGS84
2nd row: WGS84
3rd row: WGS84
4th row: WGS84
5th row: WGS84
ValueCountFrequency (%)
wgs84 5019779
100.0%

Most occurring characters

ValueCountFrequency (%)
W 5019779
20.0%
G 5019779
20.0%
S 5019779
20.0%
8 5019779
20.0%
4 5019779
20.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 15059337
60.0%
Decimal Number 10039558
40.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
W 5019779
33.3%
G 5019779
33.3%
S 5019779
33.3%
Decimal Number
ValueCountFrequency (%)
8 5019779
50.0%
4 5019779
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 15059337
60.0%
Common 10039558
40.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
W 5019779
33.3%
G 5019779
33.3%
S 5019779
33.3%
Common
ValueCountFrequency (%)
8 5019779
50.0%
4 5019779
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 25098895
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
W 5019779
20.0%
G 5019779
20.0%
S 5019779
20.0%
8 5019779
20.0%
4 5019779
20.0%

coordinateUncertaintyInMeters
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 5019781
Missing (%): > 99.9%
Memory size: 38.3 MiB

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Characters and Unicode

Total characters: 13
Distinct characters: 11
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: South-Western
ValueCountFrequency (%)
south-western 1
100.0%

Most occurring characters

ValueCountFrequency (%)
t 2
15.4%
e 2
15.4%
S 1
7.7%
o 1
7.7%
u 1
7.7%
h 1
7.7%
- 1
7.7%
W 1
7.7%
s 1
7.7%
r 1
7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10
76.9%
Uppercase Letter 2
 
15.4%
Dash Punctuation 1
 
7.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 2
20.0%
e 2
20.0%
o 1
10.0%
u 1
10.0%
h 1
10.0%
s 1
10.0%
r 1
10.0%
n 1
10.0%
Uppercase Letter
ValueCountFrequency (%)
S 1
50.0%
W 1
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12
92.3%
Common 1
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 2
16.7%
e 2
16.7%
S 1
8.3%
o 1
8.3%
u 1
8.3%
h 1
8.3%
W 1
8.3%
s 1
8.3%
r 1
8.3%
n 1
8.3%
Common
ValueCountFrequency (%)
- 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 2
15.4%
e 2
15.4%
S 1
7.7%
o 1
7.7%
u 1
7.7%
h 1
7.7%
- 1
7.7%
W 1
7.7%
s 1
7.7%
r 1
7.7%

verbatimCoordinates
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 5019781
Missing (%): > 99.9%
Memory size: 38.3 MiB

Length

Max length: 70
Median length: 70
Mean length: 70
Min length: 70

Characters and Unicode

Total characters: 70
Distinct characters: 31
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Siam [Thailand], Kwae Noi Basin Expedition, near Neeckey, near Wangka.
ValueCountFrequency (%)
near 2
20.0%
siam 1
10.0%
thailand 1
10.0%
kwae 1
10.0%
noi 1
10.0%
basin 1
10.0%
expedition 1
10.0%
neeckey 1
10.0%
wangka 1
10.0%

Most occurring characters

ValueCountFrequency (%)
9
12.9%
a 9
12.9%
e 7
 
10.0%
i 6
 
8.6%
n 6
 
8.6%
, 3
 
4.3%
k 2
 
2.9%
d 2
 
2.9%
r 2
 
2.9%
N 2
 
2.9%
Other values (21) 22
31.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 47
67.1%
Space Separator 9
 
12.9%
Uppercase Letter 8
 
11.4%
Other Punctuation 4
 
5.7%
Close Punctuation 1
 
1.4%
Open Punctuation 1
 
1.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 9
19.1%
e 7
14.9%
i 6
12.8%
n 6
12.8%
k 2
 
4.3%
d 2
 
4.3%
r 2
 
4.3%
o 2
 
4.3%
y 1
 
2.1%
c 1
 
2.1%
Other values (9) 9
19.1%
Uppercase Letter
ValueCountFrequency (%)
N 2
25.0%
W 1
12.5%
E 1
12.5%
S 1
12.5%
B 1
12.5%
K 1
12.5%
T 1
12.5%
Other Punctuation
ValueCountFrequency (%)
, 3
75.0%
. 1
 
25.0%
Space Separator
ValueCountFrequency (%)
9
100.0%
Close Punctuation
ValueCountFrequency (%)
] 1
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 55
78.6%
Common 15
 
21.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 9
16.4%
e 7
12.7%
i 6
 
10.9%
n 6
 
10.9%
k 2
 
3.6%
d 2
 
3.6%
r 2
 
3.6%
N 2
 
3.6%
o 2
 
3.6%
W 1
 
1.8%
Other values (16) 16
29.1%
Common
ValueCountFrequency (%)
9
60.0%
, 3
 
20.0%
] 1
 
6.7%
[ 1
 
6.7%
. 1
 
6.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 70
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9
12.9%
a 9
12.9%
e 7
 
10.0%
i 6
 
8.6%
n 6
 
8.6%
, 3
 
4.3%
k 2
 
2.9%
d 2
 
2.9%
r 2
 
2.9%
N 2
 
2.9%
Other values (21) 22
31.4%

verbatimSRS
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 5019781
Missing (%): > 99.9%
Memory size: 38.3 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 7
Distinct characters: 6
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 150.0 m
ValueCountFrequency (%)
150.0 1
50.0%
m 1
50.0%

Most occurring characters

ValueCountFrequency (%)
0 2
28.6%
1 1
14.3%
5 1
14.3%
. 1
14.3%
1
14.3%
m 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4
57.1%
Other Punctuation 1
 
14.3%
Space Separator 1
 
14.3%
Lowercase Letter 1
 
14.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2
50.0%
1 1
25.0%
5 1
25.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%
Lowercase Letter
ValueCountFrequency (%)
m 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6
85.7%
Latin 1
 
14.3%

Most frequent character per script

Common
ValueCountFrequency (%)
0 2
33.3%
1 1
16.7%
5 1
16.7%
. 1
16.7%
1
16.7%
Latin
ValueCountFrequency (%)
m 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 2
28.6%
1 1
14.3%
5 1
14.3%
. 1
14.3%
1
14.3%
m 1
14.3%

geologicalContextID
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 5019781
Missing (%): > 99.9%
Memory size: 38.3 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 4
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 15.1
ValueCountFrequency (%)
15.1 1
100.0%

Most occurring characters

ValueCountFrequency (%)
1 2
50.0%
5 1
25.0%
. 1
25.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3
75.0%
Other Punctuation 1
 
25.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 2
66.7%
5 1
33.3%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 2
50.0%
5 1
25.0%
. 1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2
50.0%
5 1
25.0%
. 1
25.0%

earliestEonOrLowestEonothem
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 5019781
Missing (%): > 99.9%
Memory size: 38.3 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 8
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 98.46667
ValueCountFrequency (%)
98.46667 1
100.0%

Most occurring characters

ValueCountFrequency (%)
6 3
37.5%
9 1
 
12.5%
8 1
 
12.5%
. 1
 
12.5%
4 1
 
12.5%
7 1
 
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7
87.5%
Other Punctuation 1
 
12.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 3
42.9%
9 1
 
14.3%
8 1
 
14.3%
4 1
 
14.3%
7 1
 
14.3%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
6 3
37.5%
9 1
 
12.5%
8 1
 
12.5%
. 1
 
12.5%
4 1
 
12.5%
7 1
 
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 3
37.5%
9 1
 
12.5%
8 1
 
12.5%
. 1
 
12.5%
4 1
 
12.5%
7 1
 
12.5%

latestEonOrHighestEonothem
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 5019781
Missing (%): > 99.9%
Memory size: 38.3 MiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 5
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: WGS84
ValueCountFrequency (%)
wgs84 1
100.0%

Most occurring characters

ValueCountFrequency (%)
W 1
20.0%
G 1
20.0%
S 1
20.0%
8 1
20.0%
4 1
20.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 3
60.0%
Decimal Number 2
40.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
W 1
33.3%
G 1
33.3%
S 1
33.3%
Decimal Number
ValueCountFrequency (%)
8 1
50.0%
4 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3
60.0%
Common 2
40.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
W 1
33.3%
G 1
33.3%
S 1
33.3%
Common
ValueCountFrequency (%)
8 1
50.0%
4 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
W 1
20.0%
G 1
20.0%
S 1
20.0%
8 1
20.0%
4 1
20.0%
Distinct: 2
Distinct (%): 100.0%
Missing: 5019780
Missing (%): > 99.9%
Memory size: 38.3 MiB

Length

Max length: 11
Median length: 9.5
Mean length: 9.5
Min length: 8

Characters and Unicode

Total characters: 19
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: Bakker S
2nd row: Pedersen TM
ValueCountFrequency (%)
bakker 1
25.0%
s 1
25.0%
pedersen 1
25.0%
tm 1
25.0%

Most occurring characters

ValueCountFrequency (%)
e 4
21.1%
k 2
10.5%
r 2
10.5%
2
10.5%
B 1
 
5.3%
a 1
 
5.3%
S 1
 
5.3%
P 1
 
5.3%
d 1
 
5.3%
s 1
 
5.3%
Other values (3) 3
15.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12
63.2%
Uppercase Letter 5
26.3%
Space Separator 2
 
10.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 4
33.3%
k 2
16.7%
r 2
16.7%
a 1
 
8.3%
d 1
 
8.3%
s 1
 
8.3%
n 1
 
8.3%
Uppercase Letter
ValueCountFrequency (%)
B 1
20.0%
S 1
20.0%
P 1
20.0%
T 1
20.0%
M 1
20.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 17
89.5%
Common 2
 
10.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 4
23.5%
k 2
11.8%
r 2
11.8%
B 1
 
5.9%
a 1
 
5.9%
S 1
 
5.9%
P 1
 
5.9%
d 1
 
5.9%
s 1
 
5.9%
n 1
 
5.9%
Other values (2) 2
11.8%
Common
ValueCountFrequency (%)
2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 4
21.1%
k 2
10.5%
r 2
10.5%
2
10.5%
B 1
 
5.3%
a 1
 
5.3%
S 1
 
5.3%
P 1
 
5.3%
d 1
 
5.3%
s 1
 
5.3%
Other values (3) 3
15.8%

bed
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 5019780
Missing (%): > 99.9%
Memory size: 38.3 MiB

Length

Max length: 39
Median length: 32.5
Mean length: 32.5
Min length: 26

Characters and Unicode

Total characters: 65
Distinct characters: 27
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: Physcia caesia (Hoffm.) Hampe ex Fürnr.
2nd row: Paullinia elegans Cambess.
ValueCountFrequency (%)
physcia 1
11.1%
caesia 1
11.1%
hoffm 1
11.1%
hampe 1
11.1%
ex 1
11.1%
fürnr 1
11.1%
paullinia 1
11.1%
elegans 1
11.1%
cambess 1
11.1%

Most occurring characters

ValueCountFrequency (%)
a 8
 
12.3%
7
 
10.8%
e 6
 
9.2%
s 5
 
7.7%
i 4
 
6.2%
m 3
 
4.6%
l 3
 
4.6%
n 3
 
4.6%
. 3
 
4.6%
f 2
 
3.1%
Other values (17) 21
32.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 47
72.3%
Space Separator 7
 
10.8%
Uppercase Letter 6
 
9.2%
Other Punctuation 3
 
4.6%
Close Punctuation 1
 
1.5%
Open Punctuation 1
 
1.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 8
17.0%
e 6
12.8%
s 5
10.6%
i 4
8.5%
m 3
 
6.4%
l 3
 
6.4%
n 3
 
6.4%
f 2
 
4.3%
r 2
 
4.3%
c 2
 
4.3%
Other values (9) 9
19.1%
Uppercase Letter
ValueCountFrequency (%)
P 2
33.3%
H 2
33.3%
F 1
16.7%
C 1
16.7%
Space Separator
ValueCountFrequency (%)
7
100.0%
Other Punctuation
ValueCountFrequency (%)
. 3
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 53
81.5%
Common 12
 
18.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 8
15.1%
e 6
 
11.3%
s 5
 
9.4%
i 4
 
7.5%
m 3
 
5.7%
l 3
 
5.7%
n 3
 
5.7%
f 2
 
3.8%
r 2
 
3.8%
P 2
 
3.8%
Other values (13) 15
28.3%
Common
ValueCountFrequency (%)
7
58.3%
. 3
25.0%
) 1
 
8.3%
( 1
 
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 64
98.5%
None 1
 
1.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 8
 
12.5%
7
 
10.9%
e 6
 
9.4%
s 5
 
7.8%
i 4
 
6.2%
m 3
 
4.7%
l 3
 
4.7%
n 3
 
4.7%
. 3
 
4.7%
f 2
 
3.1%
Other values (16) 20
31.2%
None
ValueCountFrequency (%)
ü 1
100.0%

typeStatus
Text

Missing 

Distinct: 14
Distinct (%): < 0.1%
Missing: 4932431
Missing (%): 98.3%
Memory size: 38.3 MiB

Length

Max length: 13
Median length: 7
Mean length: 6.994413344
Min length: 4

Characters and Unicode

Total characters: 610969
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: holotype
2nd row: isotype
3rd row: type
4th row: lectotype
5th row: type
ValueCountFrequency (%)
isotype 39000
44.6%
holotype 14456
 
16.5%
type 14207
 
16.3%
syntype 8771
 
10.0%
lectotype 3004
 
3.4%
paratype 2913
 
3.3%
isolectotype 2782
 
3.2%
isosyntype 1268
 
1.5%
neotype 578
 
0.7%
isoneotype 275
 
0.3%
Other values (4) 97
 
0.1%

Most occurring characters

ValueCountFrequency (%)
y 97390
15.9%
e 94065
15.4%
t 93181
15.3%
p 90361
14.8%
o 78963
12.9%
s 53385
8.7%
i 43399
7.1%
l 20264
 
3.3%
h 14456
 
2.4%
n 10892
 
1.8%
Other values (3) 14613
 
2.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 610969
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
y 97390
15.9%
e 94065
15.4%
t 93181
15.3%
p 90361
14.8%
o 78963
12.9%
s 53385
8.7%
i 43399
7.1%
l 20264
 
3.3%
h 14456
 
2.4%
n 10892
 
1.8%
Other values (3) 14613
 
2.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 610969
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
y 97390
15.9%
e 94065
15.4%
t 93181
15.3%
p 90361
14.8%
o 78963
12.9%
s 53385
8.7%
i 43399
7.1%
l 20264
 
3.3%
h 14456
 
2.4%
n 10892
 
1.8%
Other values (3) 14613
 
2.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 610969
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
y 97390
15.9%
e 94065
15.4%
t 93181
15.3%
p 90361
14.8%
o 78963
12.9%
s 53385
8.7%
i 43399
7.1%
l 20264
 
3.3%
h 14456
 
2.4%
n 10892
 
1.8%
Other values (3) 14613
 
2.4%

identifiedBy
Text

Missing 

Distinct: 12783
Distinct (%): 1.5%
Missing: 4152104
Missing (%): 82.7%
Memory size: 38.3 MiB

Length

Max length: 72
Median length: 54
Mean length: 11.4119028
Min length: 1

Characters and Unicode

Total characters: 9901857
Distinct characters: 109
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 4702
Unique (%): 0.5%

Sample

1st row: Wood GHS
2nd row: Steenis CGGJ van
3rd row: Pereira JT; Wong KM
4th row: Ashton PS
5th row: Nooteboom HP
ValueCountFrequency (%)
van 89759
 
4.4%
de 47967
 
2.4%
der 26776
 
1.3%
p 26721
 
1.3%
a 25734
 
1.3%
maas 25086
 
1.2%
j 24227
 
1.2%
jongkind 21969
 
1.1%
cch 21965
 
1.1%
d 21201
 
1.0%
Other values (9388) 1699526
83.7%

Most occurring characters

ValueCountFrequency (%)
1163257
 
11.7%
e 898443
 
9.1%
n 642275
 
6.5%
a 603392
 
6.1%
r 456478
 
4.6%
o 414361
 
4.2%
J 355793
 
3.6%
i 334328
 
3.4%
s 326378
 
3.3%
l 308100
 
3.1%
Other values (99) 4399052
44.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5818627
58.8%
Uppercase Letter 2825370
28.5%
Space Separator 1163257
 
11.7%
Other Punctuation 59304
 
0.6%
Dash Punctuation 34039
 
0.3%
Close Punctuation 602
 
< 0.1%
Open Punctuation 602
 
< 0.1%
Decimal Number 53
 
< 0.1%
Connector Punctuation 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 898443
15.4%
n 642275
11.0%
a 603392
10.4%
r 456478
 
7.8%
o 414361
 
7.1%
i 334328
 
5.7%
s 326378
 
5.6%
l 308100
 
5.3%
d 275276
 
4.7%
t 207110
 
3.6%
Other values (41) 1352486
23.2%
Uppercase Letter
ValueCountFrequency (%)
J 355793
12.6%
C 236242
 
8.4%
M 234695
 
8.3%
H 210215
 
7.4%
A 199191
 
7.1%
P 176050
 
6.2%
S 167959
 
5.9%
B 159051
 
5.6%
W 151893
 
5.4%
L 108781
 
3.9%
Other values (27) 825500
29.2%
Other Punctuation
ValueCountFrequency (%)
; 54937
92.6%
. 2962
 
5.0%
' 1054
 
1.8%
! 283
 
0.5%
? 56
 
0.1%
: 6
 
< 0.1%
& 5
 
< 0.1%
* 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 15
28.3%
9 10
18.9%
6 9
17.0%
4 8
15.1%
3 4
 
7.5%
2 4
 
7.5%
5 2
 
3.8%
0 1
 
1.9%
Space Separator
ValueCountFrequency (%)
1163257
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 34039
100.0%
Close Punctuation
ValueCountFrequency (%)
) 602
100.0%
Open Punctuation
ValueCountFrequency (%)
( 602
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8643997
87.3%
Common 1257860
 
12.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 898443
 
10.4%
n 642275
 
7.4%
a 603392
 
7.0%
r 456478
 
5.3%
o 414361
 
4.8%
J 355793
 
4.1%
i 334328
 
3.9%
s 326378
 
3.8%
l 308100
 
3.6%
d 275276
 
3.2%
Other values (78) 4029173
46.6%
Common
ValueCountFrequency (%)
1163257
92.5%
; 54937
 
4.4%
- 34039
 
2.7%
. 2962
 
0.2%
' 1054
 
0.1%
) 602
 
< 0.1%
( 602
 
< 0.1%
! 283
 
< 0.1%
? 56
 
< 0.1%
1 15
 
< 0.1%
Other values (11) 53
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9881797
99.8%
None 20060
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1163257
 
11.8%
e 898443
 
9.1%
n 642275
 
6.5%
a 603392
 
6.1%
r 456478
 
4.6%
o 414361
 
4.2%
J 355793
 
3.6%
i 334328
 
3.4%
s 326378
 
3.3%
l 308100
 
3.1%
Other values (63) 4378992
44.3%
None
ValueCountFrequency (%)
é 5736
28.6%
á 4134
20.6%
í 2324
11.6%
ö 2142
 
10.7%
ü 1281
 
6.4%
è 672
 
3.3%
ñ 660
 
3.3%
ä 525
 
2.6%
ó 423
 
2.1%
ú 334
 
1.7%
Other values (26) 1829
 
9.1%

dateIdentified
Text

Missing 

Distinct16460
Distinct (%)3.8%
Missing4581006
Missing (%)91.3%
Memory size38.3 MiB

Length

Max length62
Median length10
Mean length10.00016409
Min length10

Characters and Unicode

Total characters4387832
Distinct characters35
Distinct categories6
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
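
As a rough illustration of how per-category character counts like the ones in this report can be derived, the sketch below uses Python's standard `unicodedata` module to tally the general category of every character in a list of strings. The helper name `category_counts`, the `CATEGORY_NAMES` mapping and the sample strings are assumptions made for this example; this is not the code the report generator itself runs.

```python
import unicodedata
from collections import Counter

# Long names for a few general-category codes (assumed subset for this sketch);
# any other category is reported under its raw two-letter code.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Sm": "Math Symbol",
    "Zs": "Space Separator",
}


def category_counts(values):
    """Count characters per Unicode general category across an iterable of strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Hypothetical input: three date-like strings.
sample = ["1956/11/22", "1995/09/27", "1968/07/01"]
counts = category_counts(sample)
for name, n in counts.most_common():
    print(name, n)
```

Note that `/` falls under Other Punctuation and `|` under Math Symbol, which matches how those characters are grouped in the tables of this report.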

Unique

Unique4674
Unique (%)1.1%

Sample

1st row1956/11/22
2nd row1995/09/27
3rd row1968/07/01
4th row1972/06/01
5th row1957/01/18
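
The length statistics above (minimum and median length 10, maximum length 62) suggest that most values follow a `YYYY/MM/DD` pattern while a small number do not. A minimal sketch for separating the two groups, assuming that pattern is the intended format, could look like this. The function name `parse_ymd` and the sample values are hypothetical and are not taken from the dataset.

```python
from datetime import datetime


def parse_ymd(value):
    """Return a date for 'YYYY/MM/DD' strings, or None if the value does not match."""
    try:
        return datetime.strptime(value.strip(), "%Y/%m/%d").date()
    except (ValueError, AttributeError):
        # ValueError: the string does not match the format.
        # AttributeError: the value is missing (None) or not a string.
        return None


# Hypothetical inputs: two well-formed dates, one malformed value, one missing value.
values = ["1956/11/22", "1995/09/27", "not a date", None]
parsed = [parse_ymd(v) for v in values]

# Values that are present but could not be parsed deserve a manual look.
bad = [v for v, d in zip(values, parsed) if v is not None and d is None]
print(parsed)
print(bad)
```

Keeping the unparsed values in a separate list, rather than discarding them, makes it possible to inspect the handful of long or non-numeric entries individually.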
ValueCountFrequency (%)
1955/03/01 2137
 
0.5%
1972/06/01 2001
 
0.5%
1968/07/01 1800
 
0.4%
2001/12/01 1724
 
0.4%
1995/10/01 1545
 
0.4%
1979/08/01 1473
 
0.3%
1989/08/01 1409
 
0.3%
2000/06/01 1393
 
0.3%
2000/01/01 1358
 
0.3%
2000/12/01 1344
 
0.3%
Other values (16450) 422592
96.3%

Most occurring characters

ValueCountFrequency (%)
0 1100102
25.1%
1 886397
20.2%
/ 877548
20.0%
2 396674
 
9.0%
9 393348
 
9.0%
8 143873
 
3.3%
7 131859
 
3.0%
5 122219
 
2.8%
6 121562
 
2.8%
3 110818
 
2.5%
Other values (25) 103432
 
2.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3510192
80.0%
Other Punctuation 877548
 
20.0%
Lowercase Letter 76
 
< 0.1%
Uppercase Letter 9
 
< 0.1%
Math Symbol 5
 
< 0.1%
Dash Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 14
18.4%
a 12
15.8%
i 8
10.5%
c 8
10.5%
n 7
9.2%
s 6
7.9%
l 4
 
5.3%
h 3
 
3.9%
o 2
 
2.6%
y 2
 
2.6%
Other values (7) 10
13.2%
Decimal Number
ValueCountFrequency (%)
0 1100102
31.3%
1 886397
25.3%
2 396674
 
11.3%
9 393348
 
11.2%
8 143873
 
4.1%
7 131859
 
3.8%
5 122219
 
3.5%
6 121562
 
3.5%
3 110818
 
3.2%
4 103340
 
2.9%
Uppercase Letter
ValueCountFrequency (%)
L 3
33.3%
P 2
22.2%
S 2
22.2%
C 1
 
11.1%
F 1
 
11.1%
Other Punctuation
ValueCountFrequency (%)
/ 877548
100.0%
Math Symbol
ValueCountFrequency (%)
| 5
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4387747
> 99.9%
Latin 85
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 14
16.5%
a 12
14.1%
i 8
9.4%
c 8
9.4%
n 7
 
8.2%
s 6
 
7.1%
l 4
 
4.7%
L 3
 
3.5%
h 3
 
3.5%
o 2
 
2.4%
Other values (12) 18
21.2%
Common
ValueCountFrequency (%)
0 1100102
25.1%
1 886397
20.2%
/ 877548
20.0%
2 396674
 
9.0%
9 393348
 
9.0%
8 143873
 
3.3%
7 131859
 
3.0%
5 122219
 
2.8%
6 121562
 
2.8%
3 110818
 
2.5%
Other values (3) 103347
 
2.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4387832
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1100102
25.1%
1 886397
20.2%
/ 877548
20.0%
2 396674
 
9.0%
9 393348
 
9.0%
8 143873
 
3.3%
7 131859
 
3.0%
5 122219
 
2.8%
6 121562
 
2.8%
3 110818
 
2.5%
Other values (25) 103432
 
2.4%
Distinct2
Distinct (%)100.0%
Missing5019780
Missing (%)> 99.9%
Memory size38.3 MiB

Length

Max length7
Median length6
Mean length6
Min length5

Characters and Unicode

Total characters12
Distinct characters10
Distinct categories2
Distinct scripts1
Distinct blocks1

Unique

Unique2
Unique (%)100.0%

Sample

1st rowFungi
2nd rowPlantae
ValueCountFrequency (%)
fungi 1
50.0%
plantae 1
50.0%

Most occurring characters

ValueCountFrequency (%)
n 2
16.7%
a 2
16.7%
F 1
8.3%
u 1
8.3%
g 1
8.3%
i 1
8.3%
P 1
8.3%
l 1
8.3%
t 1
8.3%
e 1
8.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10
83.3%
Uppercase Letter 2
 
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 2
20.0%
a 2
20.0%
u 1
10.0%
g 1
10.0%
i 1
10.0%
l 1
10.0%
t 1
10.0%
e 1
10.0%
Uppercase Letter
ValueCountFrequency (%)
F 1
50.0%
P 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 2
16.7%
a 2
16.7%
F 1
8.3%
u 1
8.3%
g 1
8.3%
i 1
8.3%
P 1
8.3%
l 1
8.3%
t 1
8.3%
e 1
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 2
16.7%
a 2
16.7%
F 1
8.3%
u 1
8.3%
g 1
8.3%
i 1
8.3%
P 1
8.3%
l 1
8.3%
t 1
8.3%
e 1
8.3%

identificationVerificationStatus
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing5019781
Missing (%)> 99.9%
Memory size38.3 MiB

Length

Max length16
Median length16
Mean length16
Min length16

Characters and Unicode

Total characters16
Distinct characters14
Distinct categories3
Distinct scripts2
Distinct blocks1

Unique

Unique1
Unique (%)100.0%

Sample

1st rowFungi-Ascomycota
ValueCountFrequency (%)
fungi-ascomycota 1
100.0%
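
The single non-missing value in this variable, `Fungi-Ascomycota`, has the same form as the values reported for `phylum` elsewhere in this report. One possible explanation, which the report itself does not confirm, is that the fields of a single record were shifted into the wrong columns. A minimal sketch for locating such records is shown below; it assumes the rows are available as dictionaries, and the function name `rows_with_rare_values` and the two-row sample are hypothetical.

```python
def rows_with_rare_values(rows, rare_columns):
    """Yield (index, populated) for rows that hold a value in any rarely populated column."""
    for index, row in enumerate(rows):
        populated = {
            column: row[column]
            for column in rare_columns
            if row.get(column) not in (None, "")
        }
        if populated:
            yield index, populated


# Hypothetical two-row sample: the second row carries values in columns
# that are empty for almost every other record.
rows = [
    {"identificationVerificationStatus": None, "identificationRemarks": None},
    {
        "identificationVerificationStatus": "Fungi-Ascomycota",
        "identificationRemarks": "Lichenes-Lecanoromycetes",
    },
]
rare = ["identificationVerificationStatus", "identificationRemarks"]
hits = list(rows_with_rare_values(rows, rare))
print(hits)
```

Inspecting the flagged rows next to their neighbouring columns is usually enough to tell whether the values are genuine or misplaced.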

Most occurring characters

ValueCountFrequency (%)
c 2
12.5%
o 2
12.5%
F 1
 
6.2%
u 1
 
6.2%
n 1
 
6.2%
g 1
 
6.2%
i 1
 
6.2%
- 1
 
6.2%
A 1
 
6.2%
s 1
 
6.2%
Other values (4) 4
25.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13
81.2%
Uppercase Letter 2
 
12.5%
Dash Punctuation 1
 
6.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 2
15.4%
o 2
15.4%
u 1
7.7%
n 1
7.7%
g 1
7.7%
i 1
7.7%
s 1
7.7%
m 1
7.7%
y 1
7.7%
t 1
7.7%
Uppercase Letter
ValueCountFrequency (%)
F 1
50.0%
A 1
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 15
93.8%
Common 1
 
6.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 2
13.3%
o 2
13.3%
F 1
 
6.7%
u 1
 
6.7%
n 1
 
6.7%
g 1
 
6.7%
i 1
 
6.7%
A 1
 
6.7%
s 1
 
6.7%
m 1
 
6.7%
Other values (3) 3
20.0%
Common
ValueCountFrequency (%)
- 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 2
12.5%
o 2
12.5%
F 1
 
6.2%
u 1
 
6.2%
n 1
 
6.2%
g 1
 
6.2%
i 1
 
6.2%
- 1
 
6.2%
A 1
 
6.2%
s 1
 
6.2%
Other values (4) 4
25.0%

identificationRemarks
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing5019781
Missing (%)> 99.9%
Memory size38.3 MiB

Length

Max length24
Median length24
Mean length24
Min length24

Characters and Unicode

Total characters24
Distinct characters14
Distinct categories3
Distinct scripts2
Distinct blocks1

Unique

Unique1
Unique (%)100.0%

Sample

1st rowLichenes-Lecanoromycetes
ValueCountFrequency (%)
lichenes-lecanoromycetes 1
100.0%

Most occurring characters

ValueCountFrequency (%)
e 5
20.8%
c 3
12.5%
L 2
 
8.3%
n 2
 
8.3%
s 2
 
8.3%
o 2
 
8.3%
i 1
 
4.2%
h 1
 
4.2%
- 1
 
4.2%
a 1
 
4.2%
Other values (4) 4
16.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 21
87.5%
Uppercase Letter 2
 
8.3%
Dash Punctuation 1
 
4.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 5
23.8%
c 3
14.3%
n 2
 
9.5%
s 2
 
9.5%
o 2
 
9.5%
i 1
 
4.8%
h 1
 
4.8%
a 1
 
4.8%
r 1
 
4.8%
m 1
 
4.8%
Other values (2) 2
 
9.5%
Uppercase Letter
ValueCountFrequency (%)
L 2
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 23
95.8%
Common 1
 
4.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 5
21.7%
c 3
13.0%
L 2
 
8.7%
n 2
 
8.7%
s 2
 
8.7%
o 2
 
8.7%
i 1
 
4.3%
h 1
 
4.3%
a 1
 
4.3%
r 1
 
4.3%
Other values (3) 3
13.0%
Common
ValueCountFrequency (%)
- 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 5
20.8%
c 3
12.5%
L 2
 
8.3%
n 2
 
8.3%
s 2
 
8.3%
o 2
 
8.3%
i 1
 
4.2%
h 1
 
4.2%
- 1
 
4.2%
a 1
 
4.2%
Other values (4) 4
16.7%

taxonID
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing5019780
Missing (%)> 99.9%
Memory size38.3 MiB

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters20
Distinct characters11
Distinct categories2
Distinct scripts1
Distinct blocks1

Unique

Unique2
Unique (%)100.0%

Sample

1st rowCaliciales
2nd rowSapindales
ValueCountFrequency (%)
caliciales 1
50.0%
sapindales 1
50.0%

Most occurring characters

ValueCountFrequency (%)
a 4
20.0%
l 3
15.0%
i 3
15.0%
e 2
10.0%
s 2
10.0%
C 1
 
5.0%
c 1
 
5.0%
S 1
 
5.0%
p 1
 
5.0%
n 1
 
5.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 18
90.0%
Uppercase Letter 2
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4
22.2%
l 3
16.7%
i 3
16.7%
e 2
11.1%
s 2
11.1%
c 1
 
5.6%
p 1
 
5.6%
n 1
 
5.6%
d 1
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
C 1
50.0%
S 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 20
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4
20.0%
l 3
15.0%
i 3
15.0%
e 2
10.0%
s 2
10.0%
C 1
 
5.0%
c 1
 
5.0%
S 1
 
5.0%
p 1
 
5.0%
n 1
 
5.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4
20.0%
l 3
15.0%
i 3
15.0%
e 2
10.0%
s 2
10.0%
C 1
 
5.0%
c 1
 
5.0%
S 1
 
5.0%
p 1
 
5.0%
n 1
 
5.0%

acceptedNameUsageID
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing5019780
Missing (%)> 99.9%
Memory size38.3 MiB

Length

Max length20
Median length15.5
Mean length15.5
Min length11

Characters and Unicode

Total characters31
Distinct characters14
Distinct categories3
Distinct scripts2
Distinct blocks1

Unique

Unique2
Unique (%)100.0%

Sample

1st rowLichenes-Physciaceae
2nd rowSapindaceae
ValueCountFrequency (%)
lichenes-physciaceae 1
50.0%
sapindaceae 1
50.0%

Most occurring characters

ValueCountFrequency (%)
e 6
19.4%
a 5
16.1%
c 4
12.9%
i 3
9.7%
h 2
 
6.5%
n 2
 
6.5%
s 2
 
6.5%
L 1
 
3.2%
- 1
 
3.2%
P 1
 
3.2%
Other values (4) 4
12.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 27
87.1%
Uppercase Letter 3
 
9.7%
Dash Punctuation 1
 
3.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 6
22.2%
a 5
18.5%
c 4
14.8%
i 3
11.1%
h 2
 
7.4%
n 2
 
7.4%
s 2
 
7.4%
y 1
 
3.7%
p 1
 
3.7%
d 1
 
3.7%
Uppercase Letter
ValueCountFrequency (%)
L 1
33.3%
P 1
33.3%
S 1
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 30
96.8%
Common 1
 
3.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 6
20.0%
a 5
16.7%
c 4
13.3%
i 3
10.0%
h 2
 
6.7%
n 2
 
6.7%
s 2
 
6.7%
L 1
 
3.3%
P 1
 
3.3%
y 1
 
3.3%
Other values (3) 3
10.0%
Common
ValueCountFrequency (%)
- 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 31
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 6
19.4%
a 5
16.1%
c 4
12.9%
i 3
9.7%
h 2
 
6.5%
n 2
 
6.5%
s 2
 
6.5%
L 1
 
3.2%
- 1
 
3.2%
P 1
 
3.2%
Other values (4) 4
12.9%

namePublishedInID
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing5019780
Missing (%)> 99.9%
Memory size38.3 MiB

Length

Max length9
Median length8
Mean length8
Min length7

Characters and Unicode

Total characters16
Distinct characters10
Distinct categories2
Distinct scripts1
Distinct blocks1

Unique

Unique2
Unique (%)100.0%

Sample

1st rowPhyscia
2nd rowPaullinia
ValueCountFrequency (%)
physcia 1
50.0%
paullinia 1
50.0%

Most occurring characters

ValueCountFrequency (%)
i 3
18.8%
a 3
18.8%
P 2
12.5%
l 2
12.5%
h 1
 
6.2%
y 1
 
6.2%
s 1
 
6.2%
c 1
 
6.2%
u 1
 
6.2%
n 1
 
6.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 14
87.5%
Uppercase Letter 2
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 3
21.4%
a 3
21.4%
l 2
14.3%
h 1
 
7.1%
y 1
 
7.1%
s 1
 
7.1%
c 1
 
7.1%
u 1
 
7.1%
n 1
 
7.1%
Uppercase Letter
ValueCountFrequency (%)
P 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 3
18.8%
a 3
18.8%
P 2
12.5%
l 2
12.5%
h 1
 
6.2%
y 1
 
6.2%
s 1
 
6.2%
c 1
 
6.2%
u 1
 
6.2%
n 1
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 3
18.8%
a 3
18.8%
P 2
12.5%
l 2
12.5%
h 1
 
6.2%
y 1
 
6.2%
s 1
 
6.2%
c 1
 
6.2%
u 1
 
6.2%
n 1
 
6.2%
Distinct376061
Distinct (%)7.5%
Missing224
Missing (%)< 0.1%
Memory size38.3 MiB

Length

Max length105
Median length90
Mean length28.47094744
Min length2

Characters and Unicode

Total characters142911572
Distinct characters132
Distinct categories12
Distinct scripts2
Distinct blocks3

Unique

Unique133003
Unique (%)2.6%

Sample

1st rowPlantago psyllium L.
2nd rowShorea platycarpa Heim
3rd rowPlantago psyllium L.
4th rowAgathis borneensis Warb.
5th rowPlantago psyllium L.
ValueCountFrequency (%)
l 1223482
 
6.8%
361839
 
2.0%
ex 258858
 
1.4%
var 234755
 
1.3%
blume 178015
 
1.0%
subsp 159935
 
0.9%
dc 110044
 
0.6%
benth 87621
 
0.5%
indet 79377
 
0.4%
miq 74956
 
0.4%
Other values (123549) 15222043
84.6%

Most occurring characters

ValueCountFrequency (%)
a 13264086
 
9.3%
12971975
 
9.1%
i 10267217
 
7.2%
e 8900034
 
6.2%
r 8009635
 
5.6%
l 7115999
 
5.0%
s 6980933
 
4.9%
o 6762853
 
4.7%
n 6461888
 
4.5%
. 6448256
 
4.5%
Other values (122) 55728696
39.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 106479342
74.5%
Uppercase Letter 13376376
 
9.4%
Space Separator 12971978
 
9.1%
Other Punctuation 6905531
 
4.8%
Open Punctuation 1546480
 
1.1%
Close Punctuation 1546471
 
1.1%
Dash Punctuation 67960
 
< 0.1%
Math Symbol 9008
 
< 0.1%
Decimal Number 8413
 
< 0.1%
Connector Punctuation 7
 
< 0.1%
Other values (2) 6
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 13264086
12.5%
i 10267217
 
9.6%
e 8900034
 
8.4%
r 8009635
 
7.5%
l 7115999
 
6.7%
s 6980933
 
6.6%
o 6762853
 
6.4%
n 6461888
 
6.1%
u 6433194
 
6.0%
t 5317010
 
5.0%
Other values (49) 26966493
25.3%
Uppercase Letter
ValueCountFrequency (%)
L 1812034
13.5%
C 1196963
 
8.9%
S 1136590
 
8.5%
B 1026418
 
7.7%
P 856932
 
6.4%
M 848111
 
6.3%
A 844635
 
6.3%
H 742314
 
5.5%
D 685066
 
5.1%
R 626066
 
4.7%
Other values (26) 3601247
26.9%
Other Punctuation
ValueCountFrequency (%)
. 6448256
93.4%
& 361669
 
5.2%
' 79999
 
1.2%
, 15080
 
0.2%
" 323
 
< 0.1%
? 144
 
< 0.1%
! 37
 
< 0.1%
/ 15
 
< 0.1%
2
 
< 0.1%
; 2
 
< 0.1%
Other values (4) 4
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 5692
67.7%
2 1705
 
20.3%
3 195
 
2.3%
0 183
 
2.2%
6 136
 
1.6%
7 131
 
1.6%
4 125
 
1.5%
5 94
 
1.1%
8 77
 
0.9%
9 75
 
0.9%
Math Symbol
ValueCountFrequency (%)
× 8996
99.9%
= 6
 
0.1%
+ 6
 
0.1%
Space Separator
ValueCountFrequency (%)
12971975
> 99.9%
  3
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 1539498
99.5%
[ 6982
 
0.5%
Close Punctuation
ValueCountFrequency (%)
) 1539489
99.5%
] 6982
 
0.5%
Dash Punctuation
ValueCountFrequency (%)
- 67960
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 7
100.0%
Initial Punctuation
ValueCountFrequency (%)
4
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 119855718
83.9%
Common 23055854
 
16.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 13264086
 
11.1%
i 10267217
 
8.6%
e 8900034
 
7.4%
r 8009635
 
6.7%
l 7115999
 
5.9%
s 6980933
 
5.8%
o 6762853
 
5.6%
n 6461888
 
5.4%
u 6433194
 
5.4%
t 5317010
 
4.4%
Other values (85) 40342869
33.7%
Common
ValueCountFrequency (%)
12971975
56.3%
. 6448256
28.0%
( 1539498
 
6.7%
) 1539489
 
6.7%
& 361669
 
1.6%
' 79999
 
0.3%
- 67960
 
0.3%
, 15080
 
0.1%
× 8996
 
< 0.1%
] 6982
 
< 0.1%
Other values (27) 15950
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 142730005
99.9%
None 181561
 
0.1%
Punctuation 6
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 13264086
 
9.3%
12971975
 
9.1%
i 10267217
 
7.2%
e 8900034
 
6.2%
r 8009635
 
5.6%
l 7115999
 
5.0%
s 6980933
 
4.9%
o 6762853
 
4.7%
n 6461888
 
4.5%
. 6448256
 
4.5%
Other values (75) 55547129
38.9%
None
ValueCountFrequency (%)
ü 83317
45.9%
é 43766
24.1%
ö 14267
 
7.9%
× 8996
 
5.0%
ä 7026
 
3.9%
ó 4777
 
2.6%
á 4688
 
2.6%
è 3961
 
2.2%
ø 2863
 
1.6%
ç 864
 
0.5%
Other values (35) 7036
 
3.9%
Punctuation
ValueCountFrequency (%)
4
66.7%
2
33.3%

parentNameUsage
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing5019780
Missing (%)> 99.9%
Memory size38.3 MiB

Length

Max length7
Median length6.5
Mean length6.5
Min length6

Characters and Unicode

Total characters13
Distinct characters8
Distinct categories1
Distinct scripts1
Distinct blocks1

Unique

Unique2
Unique (%)100.0%

Sample

1st rowcaesia
2nd rowelegans
ValueCountFrequency (%)
caesia 1
50.0%
elegans 1
50.0%

Most occurring characters

ValueCountFrequency (%)
a 3
23.1%
e 3
23.1%
s 2
15.4%
c 1
 
7.7%
i 1
 
7.7%
l 1
 
7.7%
g 1
 
7.7%
n 1
 
7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3
23.1%
e 3
23.1%
s 2
15.4%
c 1
 
7.7%
i 1
 
7.7%
l 1
 
7.7%
g 1
 
7.7%
n 1
 
7.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 13
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3
23.1%
e 3
23.1%
s 2
15.4%
c 1
 
7.7%
i 1
 
7.7%
l 1
 
7.7%
g 1
 
7.7%
n 1
 
7.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3
23.1%
e 3
23.1%
s 2
15.4%
c 1
 
7.7%
i 1
 
7.7%
l 1
 
7.7%
g 1
 
7.7%
n 1
 
7.7%

namePublishedIn
Text

Constant  Missing 

Distinct1
Distinct (%)50.0%
Missing5019780
Missing (%)> 99.9%
Memory size38.3 MiB

Length

Max length7
Median length7
Mean length7
Min length7

Characters and Unicode

Total characters14
Distinct characters5
Distinct categories1
Distinct scripts1
Distinct blocks1

Unique

Unique0
Unique (%)0.0%

Sample

1st rowspecies
2nd rowspecies
ValueCountFrequency (%)
species 2
100.0%

Most occurring characters

ValueCountFrequency (%)
s 4
28.6%
e 4
28.6%
p 2
14.3%
c 2
14.3%
i 2
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 14
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 4
28.6%
e 4
28.6%
p 2
14.3%
c 2
14.3%
i 2
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 14
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 4
28.6%
e 4
28.6%
p 2
14.3%
c 2
14.3%
i 2
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 4
28.6%
e 4
28.6%
p 2
14.3%
c 2
14.3%
i 2
14.3%
Distinct1414
Distinct (%)< 0.1%
Missing488
Missing (%)< 0.1%
Memory size38.3 MiB

Length

Max length79
Median length68
Mean length29.82403601
Min length8

Characters and Unicode

Total characters149695605
Distinct characters60
Distinct categories8
Distinct scripts2
Distinct blocks2

Unique

Unique116
Unique (%)< 0.1%

Sample

1st rowPlantae|Lamiales|Plantaginaceae
2nd rowPlantae|Malvales|Dipterocarpaceae
3rd rowPlantae|Lamiales|Plantaginaceae
4th rowPlantae|Cupressales|Araucariaceae
5th rowPlantae|Lamiales|Plantaginaceae
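
The sample values above are pipe-delimited paths such as `Plantae|Lamiales|Plantaginaceae`. A small sketch for breaking such strings into their parts and summarising them follows. The helper name `split_path` is an assumption for this example, and the code deliberately does not assign rank names to the positions, since values with more or fewer than three parts may exist.

```python
from collections import Counter


def split_path(value, sep="|"):
    """Split a delimited classification string into its non-empty, stripped parts."""
    return [part.strip() for part in value.split(sep) if part.strip()]


# Sample values copied from the rows shown above.
samples = [
    "Plantae|Lamiales|Plantaginaceae",
    "Plantae|Malvales|Dipterocarpaceae",
    "Plantae|Cupressales|Araucariaceae",
]

# How many parts each value has, and which first-level names occur.
depths = Counter(len(split_path(s)) for s in samples)
top_level = Counter(split_path(s)[0] for s in samples)
print(depths)
print(top_level)
```

Counting the number of parts per value first is a cheap way to spot entries that deviate from the common three-part shape before mapping positions to ranks.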
ValueCountFrequency (%)
plantae|fabales|fabaceae 308028
 
6.1%
plantae|asterales|asteraceae 302803
 
6.0%
plantae|poales|poaceae 281272
 
5.6%
plantae|gentianales|rubiaceae 189877
 
3.8%
plantae|poales|cyperaceae 141951
 
2.8%
plantae|lamiales|lamiaceae 116077
 
2.3%
plantae|rosales|rosaceae 114928
 
2.3%
plantae|asparagales|orchidaceae 94113
 
1.9%
plantae|malpighiales|euphorbiaceae 91199
 
1.8%
plantae|malvales|malvaceae 80345
 
1.6%
Other values (1415) 3309436
65.8%

Most occurring characters

ValueCountFrequency (%)
a 30545157
20.4%
e 22993015
15.4%
l 13087306
8.7%
| 10172681
 
6.8%
n 8204441
 
5.5%
t 7522789
 
5.0%
s 7327084
 
4.9%
c 7056672
 
4.7%
P 6334479
 
4.2%
i 5649209
 
3.8%
Other values (50) 30802772
20.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 123945796
82.8%
Uppercase Letter 15347409
 
10.3%
Math Symbol 10172681
 
6.8%
Dash Punctuation 183013
 
0.1%
Other Punctuation 35969
 
< 0.1%
Space Separator 10735
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 30545157
24.6%
e 22993015
18.6%
l 13087306
10.6%
n 8204441
 
6.6%
t 7522789
 
6.1%
s 7327084
 
5.9%
c 7056672
 
5.7%
i 5649209
 
4.6%
r 4219211
 
3.4%
o 4137100
 
3.3%
Other values (17) 13203812
10.7%
Uppercase Letter
ValueCountFrequency (%)
P 6334479
41.3%
A 1541788
 
10.0%
M 1150202
 
7.5%
C 1027425
 
6.7%
F 966551
 
6.3%
L 791072
 
5.2%
R 773189
 
5.0%
S 593219
 
3.9%
G 453807
 
3.0%
E 432771
 
2.8%
Other values (16) 1282906
 
8.4%
Other Punctuation
ValueCountFrequency (%)
? 21596
60.0%
. 14373
40.0%
Math Symbol
ValueCountFrequency (%)
| 10172681
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 183013
100.0%
Space Separator
ValueCountFrequency (%)
10735
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 139293205
93.1%
Common 10402400
 
6.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 30545157
21.9%
e 22993015
16.5%
l 13087306
9.4%
n 8204441
 
5.9%
t 7522789
 
5.4%
s 7327084
 
5.3%
c 7056672
 
5.1%
P 6334479
 
4.5%
i 5649209
 
4.1%
r 4219211
 
3.0%
Other values (43) 26353842
18.9%
Common
ValueCountFrequency (%)
| 10172681
97.8%
- 183013
 
1.8%
? 21596
 
0.2%
. 14373
 
0.1%
10735
 
0.1%
( 1
 
< 0.1%
) 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 149695604
> 99.9%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 30545157
20.4%
e 22993015
15.4%
l 13087306
8.7%
| 10172681
 
6.8%
n 8204441
 
5.5%
t 7522789
 
5.0%
s 7327084
 
4.9%
c 7056672
 
4.7%
P 6334479
 
4.2%
i 5649209
 
3.8%
Other values (49) 30802771
20.6%
None
ValueCountFrequency (%)
ü 1
100.0%
Distinct5
Distinct (%)< 0.1%
Missing496
Missing (%)< 0.1%
Memory size38.3 MiB

Length

Max length10
Median length7
Mean length6.982224563
Min length5

Characters and Unicode

Total characters35045782
Distinct characters20
Distinct categories2
Distinct scripts1
Distinct blocks1

Unique

Unique0
Unique (%)0.0%

Sample

1st rowPlantae
2nd rowPlantae
3rd rowPlantae
4th rowPlantae
5th rowPlantae
ValueCountFrequency (%)
plantae 4861934
96.9%
fungi 104468
 
2.1%
chromista 37916
 
0.8%
eubacteria 14458
 
0.3%
protozoa 510
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
a 9791210
27.9%
n 4966402
14.2%
t 4914818
14.0%
e 4876392
13.9%
P 4862444
13.9%
l 4861934
13.9%
i 156842
 
0.4%
u 118926
 
0.3%
F 104468
 
0.3%
g 104468
 
0.3%
Other values (10) 287878
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 30026496
85.7%
Uppercase Letter 5019286
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 9791210
32.6%
n 4966402
16.5%
t 4914818
16.4%
e 4876392
16.2%
l 4861934
16.2%
i 156842
 
0.5%
u 118926
 
0.4%
g 104468
 
0.3%
r 52884
 
0.2%
o 39446
 
0.1%
Other values (6) 143174
 
0.5%
Uppercase Letter
ValueCountFrequency (%)
P 4862444
96.9%
F 104468
 
2.1%
C 37916
 
0.8%
E 14458
 
0.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 35045782
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 9791210
27.9%
n 4966402
14.2%
t 4914818
14.0%
e 4876392
13.9%
P 4862444
13.9%
l 4861934
13.9%
i 156842
 
0.4%
u 118926
 
0.3%
F 104468
 
0.3%
g 104468
 
0.3%
Other values (10) 287878
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 35045782
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 9791210
27.9%
n 4966402
14.2%
t 4914818
14.0%
e 4876392
13.9%
P 4862444
13.9%
l 4861934
13.9%
i 156842
 
0.4%
u 118926
 
0.3%
F 104468
 
0.3%
g 104468
 
0.3%
Other values (10) 287878
 
0.8%

phylum
Text

Missing 

Distinct29
Distinct (%)< 0.1%
Missing4742156
Missing (%)94.5%
Memory size38.3 MiB

Length

Max length23
Median length22
Mean length13.09837695
Min length3

Characters and Unicode

Total characters3636450
Distinct characters38
Distinct categories3
Distinct scripts2
Distinct blocks1

Unique

Unique4
Unique (%)< 0.1%

Sample

1st rowFungi-Ascomycota
2nd rowFungi-Ascomycota
3rd rowFungi-Ascomycota
4th rowFungi-Ascomycota
5th rowFungi-Ascomycota
ValueCountFrequency (%)
rhodophyta 69790
25.1%
fungi-basidiomycota 52456
18.9%
fungi-ascomycota 45602
16.4%
chlorophyta 45198
16.3%
ochrophyta 32323
11.6%
cyanobacteria 14344
 
5.2%
charophyta 11679
 
4.2%
bacillariophyta 5278
 
1.9%
amoebozoa 445
 
0.2%
oomycota 248
 
0.1%
Other values (19) 263
 
0.1%

Most occurring characters

ValueCountFrequency (%)
o 492110
13.5%
a 381066
 
10.5%
h 323307
 
8.9%
t 277145
 
7.6%
y 277061
 
7.6%
i 228102
 
6.3%
c 196019
 
5.4%
p 164308
 
4.5%
d 122285
 
3.4%
n 112587
 
3.1%
Other values (28) 1062460
29.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3162532
87.0%
Uppercase Letter 375774
 
10.3%
Dash Punctuation 98144
 
2.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 492110
15.6%
a 381066
12.0%
h 323307
10.2%
t 277145
8.8%
y 277061
8.8%
i 228102
 
7.2%
c 196019
 
6.2%
p 164308
 
5.2%
d 122285
 
3.9%
n 112587
 
3.6%
Other values (11) 588542
18.6%
Uppercase Letter
ValueCountFrequency (%)
F 98144
26.1%
C 71252
19.0%
R 69790
18.6%
B 57734
15.4%
A 46053
12.3%
O 32571
 
8.7%
E 92
 
< 0.1%
M 72
 
< 0.1%
P 34
 
< 0.1%
Z 15
 
< 0.1%
Other values (6) 17
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 98144
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3538306
97.3%
Common 98144
 
2.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 492110
13.9%
a 381066
 
10.8%
h 323307
 
9.1%
t 277145
 
7.8%
y 277061
 
7.8%
i 228102
 
6.4%
c 196019
 
5.5%
p 164308
 
4.6%
d 122285
 
3.5%
n 112587
 
3.2%
Other values (27) 964316
27.3%
Common
ValueCountFrequency (%)
- 98144
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3636450
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 492110
13.5%
a 381066
 
10.5%
h 323307
 
8.9%
t 277145
 
7.6%
y 277061
 
7.6%
i 228102
 
6.3%
c 196019
 
5.4%
p 164308
 
4.5%
d 122285
 
3.4%
n 112587
 
3.1%
Other values (28) 1062460
29.2%

class
Text

Missing 

Distinct91
Distinct (%)< 0.1%
Missing4741605
Missing (%)94.5%
Memory size38.3 MiB

Length

Max length29
Median length26
Mean length15.96505103
Min length6

Characters and Unicode

Total characters4441110
Distinct characters43
Distinct categories4
Distinct scripts2
Distinct blocks1

Unique

Unique12
Unique (%)< 0.1%

Sample

1st rowLichenes-
2nd rowLichenes-
3rd rowLichenes-
4th rowLichenes-
5th rowLichenes-
ValueCountFrequency (%)
florideophyceae 65290
23.5%
fungi-agaricomycetes 48215
17.3%
phaeophyceae 30367
10.9%
ulvophyceae 29708
10.7%
lichenes-lecanoromycetes 26216
9.4%
chlorophyceae 13735
 
4.9%
cyanophyceae 13043
 
4.7%
charophyceae 7130
 
2.6%
fungi-pezizomycetes 6007
 
2.2%
lichenes 5353
 
1.9%
Other values (82) 33337
12.0%

Most occurring characters

ValueCountFrequency (%)
e 745781
16.8%
o 397801
 
9.0%
c 389649
 
8.8%
a 323750
 
7.3%
y 285919
 
6.4%
h 262782
 
5.9%
i 255647
 
5.8%
r 180931
 
4.1%
p 175103
 
3.9%
n 153502
 
3.5%
Other values (33) 1270245
28.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3962494
89.2%
Uppercase Letter 375284
 
8.5%
Dash Punctuation 103108
 
2.3%
Space Separator 224
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 745781
18.8%
o 397801
10.0%
c 389649
9.8%
a 323750
8.2%
y 285919
 
7.2%
h 262782
 
6.6%
i 255647
 
6.5%
r 180931
 
4.6%
p 175103
 
4.4%
n 153502
 
3.9%
Other values (13) 791629
20.0%
Uppercase Letter
ValueCountFrequency (%)
F 135947
36.2%
L 62393
16.6%
A 49691
 
13.2%
C 39125
 
10.4%
P 38446
 
10.2%
U 29773
 
7.9%
B 7002
 
1.9%
S 5253
 
1.4%
D 2966
 
0.8%
T 1654
 
0.4%
Other values (8) 3034
 
0.8%
Dash Punctuation
ValueCountFrequency (%)
- 103108
100.0%
Space Separator
ValueCountFrequency (%)
224
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4337778
97.7%
Common 103332
 
2.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 745781
17.2%
o 397801
 
9.2%
c 389649
 
9.0%
a 323750
 
7.5%
y 285919
 
6.6%
h 262782
 
6.1%
i 255647
 
5.9%
r 180931
 
4.2%
p 175103
 
4.0%
n 153502
 
3.5%
Other values (31) 1166913
26.9%
Common
ValueCountFrequency (%)
- 103108
99.8%
224
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4441110
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 745781
16.8%
o 397801
 
9.0%
c 389649
 
8.8%
a 323750
 
7.3%
y 285919
 
6.4%
h 262782
 
5.9%
i 255647
 
5.8%
r 180931
 
4.1%
p 175103
 
3.9%
n 153502
 
3.5%
Other values (33) 1270245
28.6%

order
Text

Missing 

Distinct 380
Distinct (%) < 0.1%
Missing 143842
Missing (%) 2.9%
Memory size 38.3 MiB

Length

Max length 23
Median length 18
Mean length 9.414907279
Min length 1

Characters and Unicode

Total characters 45906523
Distinct characters 49
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 34
Unique (%) < 0.1%

Sample

1st row Lamiales
2nd row Malvales
3rd row Lamiales
4th row Cupressales
5th row Lamiales
ValueCountFrequency (%)
poales 469510
 
9.6%
malpighiales 338062
 
6.9%
asterales 336633
 
6.9%
fabales 327880
 
6.7%
lamiales 320256
 
6.6%
gentianales 310878
 
6.4%
rosales 239690
 
4.9%
ericales 188588
 
3.9%
caryophyllales 183655
 
3.8%
sapindales 166440
 
3.4%
Other values (371) 1994349
40.9%

Most occurring characters

ValueCountFrequency (%)
a 7993736
17.4%
l 6487296
14.1%
s 6014249
13.1%
e 5900801
12.9%
i 2811732
 
6.1%
o 1694688
 
3.7%
r 1659025
 
3.6%
n 1397350
 
3.0%
p 1231222
 
2.7%
t 1086092
 
2.4%
Other values (39) 9630332
21.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 41030581
89.4%
Uppercase Letter 4875908
 
10.6%
Other Punctuation 33
 
< 0.1%
Space Separator 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 7993736
19.5%
l 6487296
15.8%
s 6014249
14.7%
e 5900801
14.4%
i 2811732
 
6.9%
o 1694688
 
4.1%
r 1659025
 
4.0%
n 1397350
 
3.4%
p 1231222
 
3.0%
t 1086092
 
2.6%
Other values (15) 4754390
11.6%
Uppercase Letter
ValueCountFrequency (%)
M 718332
14.7%
P 707510
14.5%
A 686301
14.1%
L 426986
8.8%
F 382064
7.8%
C 367874
7.5%
G 358040
7.3%
R 326597
6.7%
S 324515
6.7%
E 209320
 
4.3%
Other values (12) 368369
7.6%
Other Punctuation
ValueCountFrequency (%)
? 33
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 45906489
> 99.9%
Common 34
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 7993736
17.4%
l 6487296
14.1%
s 6014249
13.1%
e 5900801
12.9%
i 2811732
 
6.1%
o 1694688
 
3.7%
r 1659025
 
3.6%
n 1397350
 
3.0%
p 1231222
 
2.7%
t 1086092
 
2.4%
Other values (37) 9630298
21.0%
Common
ValueCountFrequency (%)
? 33
97.1%
1
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 45906523
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 7993736
17.4%
l 6487296
14.1%
s 6014249
13.1%
e 5900801
12.9%
i 2811732
 
6.1%
o 1694688
 
3.7%
r 1659025
 
3.6%
n 1397350
 
3.0%
p 1231222
 
2.7%
t 1086092
 
2.4%
Other values (39) 9630332
21.0%

family
Text

Distinct 1406
Distinct (%) < 0.1%
Missing 1212
Missing (%) < 0.1%
Memory size 38.3 MiB

Length

Max length 28
Median length 25
Mean length 10.7858368
Min length 1

Characters and Unicode

Total characters 54129477
Distinct characters 56
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 112
Unique (%) < 0.1%

Sample

1st row Plantaginaceae
2nd row Dipterocarpaceae
3rd row Plantaginaceae
4th row Araucariaceae
5th row Plantaginaceae
ValueCountFrequency (%)
fabaceae 308028
 
6.1%
asteraceae 302803
 
6.0%
poaceae 281272
 
5.6%
rubiaceae 189877
 
3.8%
cyperaceae 141951
 
2.8%
lamiaceae 116077
 
2.3%
rosaceae 114928
 
2.3%
orchidaceae 94113
 
1.9%
euphorbiaceae 91199
 
1.8%
malvaceae 80345
 
1.6%
Other values (1398) 3308484
65.8%

Most occurring characters

ValueCountFrequency (%)
a 12436459
23.0%
e 11470038
21.2%
c 6036012
11.2%
i 2424988
 
4.5%
r 2326369
 
4.3%
o 2005164
 
3.7%
n 1687186
 
3.1%
l 1617349
 
3.0%
t 1411236
 
2.6%
s 1142952
 
2.1%
Other values (46) 11571724
21.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 48926205
90.4%
Uppercase Letter 5076927
 
9.4%
Dash Punctuation 79905
 
0.1%
Other Punctuation 35933
 
0.1%
Space Separator 10507
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 12436459
25.4%
e 11470038
23.4%
c 6036012
12.3%
i 2424988
 
5.0%
r 2326369
 
4.8%
o 2005164
 
4.1%
n 1687186
 
3.4%
l 1617349
 
3.3%
t 1411236
 
2.9%
s 1142952
 
2.3%
Other values (16) 6368452
13.0%
Uppercase Letter
ValueCountFrequency (%)
A 805796
15.9%
P 726079
14.3%
C 582509
11.5%
R 446592
8.8%
M 431232
8.5%
F 344071
6.8%
L 301693
 
5.9%
S 263451
 
5.2%
B 214499
 
4.2%
E 208569
 
4.1%
Other values (16) 752436
14.8%
Other Punctuation
ValueCountFrequency (%)
? 21563
60.0%
. 14370
40.0%
Dash Punctuation
ValueCountFrequency (%)
- 79905
100.0%
Space Separator
ValueCountFrequency (%)
10507
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 54003132
99.8%
Common 126345
 
0.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 12436459
23.0%
e 11470038
21.2%
c 6036012
11.2%
i 2424988
 
4.5%
r 2326369
 
4.3%
o 2005164
 
3.7%
n 1687186
 
3.1%
l 1617349
 
3.0%
t 1411236
 
2.6%
s 1142952
 
2.1%
Other values (42) 11445379
21.2%
Common
ValueCountFrequency (%)
- 79905
63.2%
? 21563
 
17.1%
. 14370
 
11.4%
10507
 
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 54129477
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 12436459
23.0%
e 11470038
21.2%
c 6036012
11.2%
i 2424988
 
4.5%
r 2326369
 
4.3%
o 2005164
 
3.7%
n 1687186
 
3.1%
l 1617349
 
3.0%
t 1411236
 
2.6%
s 1142952
 
2.1%
Other values (46) 11571724
21.4%

genus
Text

Distinct 20571
Distinct (%) 0.4%
Missing 224
Missing (%) < 0.1%
Memory size 38.3 MiB

Length

Max length 23
Median length 20
Mean length 8.494074976
Min length 2

Characters and Unicode

Total characters 42636502
Distinct characters 57
Distinct categories 6
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3021
Unique (%) 0.1%

Sample

1st row Plantago
2nd row Shorea
3rd row Plantago
4th row Agathis
5th row Plantago
ValueCountFrequency (%)
indet 79377
 
1.6%
carex 59786
 
1.2%
ficus 43081
 
0.9%
rubus 36824
 
0.7%
taraxacum 28101
 
0.6%
hieracium 27463
 
0.5%
cyperus 23409
 
0.5%
salix 21702
 
0.4%
ranunculus 21385
 
0.4%
euphorbia 19128
 
0.4%
Other values (20562) 4659308
92.8%

Most occurring characters

ValueCountFrequency (%)
a 5220873
 
12.2%
i 3825339
 
9.0%
e 2967585
 
7.0%
r 2820232
 
6.6%
o 2777638
 
6.5%
u 2380535
 
5.6%
s 2337197
 
5.5%
n 2234005
 
5.2%
l 2185796
 
5.1%
t 1806151
 
4.2%
Other values (47) 14081151
33.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 37536848
88.0%
Uppercase Letter 5019562
 
11.8%
Other Punctuation 79377
 
0.2%
Dash Punctuation 517
 
< 0.1%
Math Symbol 192
 
< 0.1%
Space Separator 6
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 5220873
13.9%
i 3825339
10.2%
e 2967585
 
7.9%
r 2820232
 
7.5%
o 2777638
 
7.4%
u 2380535
 
6.3%
s 2337197
 
6.2%
n 2234005
 
6.0%
l 2185796
 
5.8%
t 1806151
 
4.8%
Other values (17) 8981497
23.9%
Uppercase Letter
ValueCountFrequency (%)
C 691853
13.8%
P 514468
 
10.2%
A 478156
 
9.5%
S 477199
 
9.5%
M 288158
 
5.7%
D 266339
 
5.3%
L 256569
 
5.1%
E 243750
 
4.9%
T 236943
 
4.7%
H 215486
 
4.3%
Other values (16) 1350641
26.9%
Other Punctuation
ValueCountFrequency (%)
. 79377
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 517
100.0%
Math Symbol
ValueCountFrequency (%)
× 192
100.0%
Space Separator
ValueCountFrequency (%)
6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 42556410
99.8%
Common 80092
 
0.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 5220873
 
12.3%
i 3825339
 
9.0%
e 2967585
 
7.0%
r 2820232
 
6.6%
o 2777638
 
6.5%
u 2380535
 
5.6%
s 2337197
 
5.5%
n 2234005
 
5.2%
l 2185796
 
5.1%
t 1806151
 
4.2%
Other values (43) 14001059
32.9%
Common
ValueCountFrequency (%)
. 79377
99.1%
- 517
 
0.6%
× 192
 
0.2%
6
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 42636307
> 99.9%
None 195
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 5220873
 
12.2%
i 3825339
 
9.0%
e 2967585
 
7.0%
r 2820232
 
6.6%
o 2777638
 
6.5%
u 2380535
 
5.6%
s 2337197
 
5.5%
n 2234005
 
5.2%
l 2185796
 
5.1%
t 1806151
 
4.2%
Other values (45) 14080956
33.0%
None
ValueCountFrequency (%)
× 192
98.5%
ë 3
 
1.5%

subgenus
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 5019781
Missing (%) > 99.9%
Memory size 38.3 MiB

Length

Max length 42
Median length 42
Mean length 42
Min length 42

Characters and Unicode

Total characters 42
Distinct characters 20
Distinct categories 6
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Fimbristylis bisumbellata (Forssk.) Bubani
ValueCountFrequency (%)
fimbristylis 1
25.0%
bisumbellata 1
25.0%
forssk 1
25.0%
bubani 1
25.0%

Most occurring characters

ValueCountFrequency (%)
i 5
11.9%
s 5
11.9%
b 4
 
9.5%
l 3
 
7.1%
a 3
 
7.1%
3
 
7.1%
F 2
 
4.8%
u 2
 
4.8%
t 2
 
4.8%
r 2
 
4.8%
Other values (10) 11
26.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 33
78.6%
Space Separator 3
 
7.1%
Uppercase Letter 3
 
7.1%
Open Punctuation 1
 
2.4%
Other Punctuation 1
 
2.4%
Close Punctuation 1
 
2.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 5
15.2%
s 5
15.2%
b 4
12.1%
l 3
9.1%
a 3
9.1%
u 2
 
6.1%
t 2
 
6.1%
r 2
 
6.1%
m 2
 
6.1%
y 1
 
3.0%
Other values (4) 4
12.1%
Uppercase Letter
ValueCountFrequency (%)
F 2
66.7%
B 1
33.3%
Space Separator
ValueCountFrequency (%)
3
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 36
85.7%
Common 6
 
14.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 5
13.9%
s 5
13.9%
b 4
11.1%
l 3
8.3%
a 3
8.3%
F 2
 
5.6%
u 2
 
5.6%
t 2
 
5.6%
r 2
 
5.6%
m 2
 
5.6%
Other values (6) 6
16.7%
Common
ValueCountFrequency (%)
3
50.0%
( 1
 
16.7%
. 1
 
16.7%
) 1
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 42
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 5
11.9%
s 5
11.9%
b 4
 
9.5%
l 3
 
7.1%
a 3
 
7.1%
3
 
7.1%
F 2
 
4.8%
u 2
 
4.8%
t 2
 
4.8%
r 2
 
4.8%
Other values (10) 11
26.2%

specificEpithet
Text

Missing 

Distinct 74468
Distinct (%) 1.6%
Missing 420613
Missing (%) 8.4%
Memory size 38.3 MiB

Length

Max length 37
Median length 23
Mean length 9.008492186
Min length 2

Characters and Unicode

Total characters 41431578
Distinct characters 80
Distinct categories 10
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 19361
Unique (%) 0.4%

Sample

1st row psyllium
2nd row platycarpa
3rd row psyllium
4th row borneensis
5th row psyllium
ValueCountFrequency (%)
vulgaris 23406
 
0.5%
palustris 17443
 
0.4%
arvensis 16770
 
0.4%
indica 15214
 
0.3%
officinalis 15193
 
0.3%
repens 13144
 
0.3%
maritima 12040
 
0.3%
alpina 11689
 
0.3%
tomentosa 11026
 
0.2%
montana 10538
 
0.2%
Other values (74399) 4454027
96.8%

Most occurring characters

ValueCountFrequency (%)
a 5674666
13.7%
i 4678236
11.3%
s 3099567
 
7.5%
e 2899018
 
7.0%
r 2736809
 
6.6%
l 2697840
 
6.5%
n 2575102
 
6.2%
u 2534175
 
6.1%
o 2373564
 
5.7%
t 2185739
 
5.3%
Other values (70) 9976862
24.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 41389612
99.9%
Dash Punctuation 30883
 
0.1%
Math Symbol 8467
 
< 0.1%
Space Separator 1347
 
< 0.1%
Other Punctuation 850
 
< 0.1%
Uppercase Letter 146
 
< 0.1%
Decimal Number 105
 
< 0.1%
Open Punctuation 83
 
< 0.1%
Close Punctuation 81
 
< 0.1%
Initial Punctuation 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 5674666
13.7%
i 4678236
11.3%
s 3099567
 
7.5%
e 2899018
 
7.0%
r 2736809
 
6.6%
l 2697840
 
6.5%
n 2575102
 
6.2%
u 2534175
 
6.1%
o 2373564
 
5.7%
t 2185739
 
5.3%
Other values (26) 9934896
24.0%
Uppercase Letter
ValueCountFrequency (%)
C 55
37.7%
M 43
29.5%
F 8
 
5.5%
I 6
 
4.1%
A 6
 
4.1%
E 6
 
4.1%
D 5
 
3.4%
S 4
 
2.7%
N 3
 
2.1%
W 2
 
1.4%
Other values (6) 8
 
5.5%
Other Punctuation
ValueCountFrequency (%)
. 765
90.0%
? 36
 
4.2%
" 32
 
3.8%
! 11
 
1.3%
2
 
0.2%
* 1
 
0.1%
& 1
 
0.1%
/ 1
 
0.1%
% 1
 
0.1%
Decimal Number
ValueCountFrequency (%)
1 44
41.9%
2 30
28.6%
3 15
 
14.3%
4 4
 
3.8%
7 4
 
3.8%
8 3
 
2.9%
9 3
 
2.9%
5 1
 
1.0%
0 1
 
1.0%
Math Symbol
ValueCountFrequency (%)
× 8464
> 99.9%
= 3
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1344
99.8%
  3
 
0.2%
Open Punctuation
ValueCountFrequency (%)
( 69
83.1%
[ 14
 
16.9%
Close Punctuation
ValueCountFrequency (%)
) 67
82.7%
] 14
 
17.3%
Dash Punctuation
ValueCountFrequency (%)
- 30883
100.0%
Initial Punctuation
ValueCountFrequency (%)
4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 41389758
99.9%
Common 41820
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 5674666
13.7%
i 4678236
11.3%
s 3099567
 
7.5%
e 2899018
 
7.0%
r 2736809
 
6.6%
l 2697840
 
6.5%
n 2575102
 
6.2%
u 2534175
 
6.1%
o 2373564
 
5.7%
t 2185739
 
5.3%
Other values (42) 9935042
24.0%
Common
ValueCountFrequency (%)
- 30883
73.8%
× 8464
 
20.2%
1344
 
3.2%
. 765
 
1.8%
( 69
 
0.2%
) 67
 
0.2%
1 44
 
0.1%
? 36
 
0.1%
" 32
 
0.1%
2 30
 
0.1%
Other values (18) 86
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 41423019
> 99.9%
None 8553
 
< 0.1%
Punctuation 6
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 5674666
13.7%
i 4678236
11.3%
s 3099567
 
7.5%
e 2899018
 
7.0%
r 2736809
 
6.6%
l 2697840
 
6.5%
n 2575102
 
6.2%
u 2534175
 
6.1%
o 2373564
 
5.7%
t 2185739
 
5.3%
Other values (56) 9968303
24.1%
None
ValueCountFrequency (%)
× 8464
99.0%
ü 38
 
0.4%
ë 21
 
0.2%
ï 9
 
0.1%
ö 6
 
0.1%
é 5
 
0.1%
á 3
 
< 0.1%
  3
 
< 0.1%
ó 1
 
< 0.1%
ä 1
 
< 0.1%
Other values (2) 2
 
< 0.1%
Punctuation
ValueCountFrequency (%)
4
66.7%
2
33.3%
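
The non-ASCII characters listed in the tables above (the hybrid marker ×, accented letters, and what may be a second kind of space) can be inspected individually. The following is a minimal sketch using Python's standard `unicodedata` module; the function name `describe_non_ascii` and the sample strings are hypothetical, and the no-break space in the sample is an assumption about what the second Space Separator entry might be.

```python
import unicodedata


def describe_non_ascii(values):
    """Return {char: (Unicode name, general category)} for each non-ASCII character."""
    found = {}
    for value in values:
        for ch in value:
            if ord(ch) > 127 and ch not in found:
                found[ch] = (unicodedata.name(ch, "UNKNOWN"), unicodedata.category(ch))
    return found


# Hypothetical strings: a hybrid marker, a diaeresis, and a no-break space.
sample = ["\u00d7 hybrida", "bo\u00ebrhavii", "sp.\u00a0nov"]
result = describe_non_ascii(sample)

for ch, (name, cat) in result.items():
    print(f"U+{ord(ch):04X} {name} ({cat})")
```

A listing like this makes it easier to decide which characters are legitimate (× for hybrids, diacritics in names) and which are likely data-entry or encoding residue worth normalising.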

infraspecificEpithet
Text

Missing 

Distinct 25248
Distinct (%) 6.1%
Missing 4607995
Missing (%) 91.8%
Memory size 38.3 MiB

Length

Max length 25
Median length 22
Mean length 9.160235753
Min length 1

Characters and Unicode

Total characters 3772066
Distinct characters 66
Distinct categories 9
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 9012
Unique (%) 2.2%

Sample

1st row velutinata
2nd row mollis
3rd row bract brevioribus
4th row vrieseanum
5th row candollei
ValueCountFrequency (%)
angustifolia 2329
 
0.6%
glabra 2075
 
0.5%
pubescens 1991
 
0.5%
vulgaris 1822
 
0.4%
minor 1585
 
0.4%
major 1573
 
0.4%
album 1571
 
0.4%
montana 1497
 
0.4%
alba 1374
 
0.3%
typica 1327
 
0.3%
Other values (25081) 396330
95.9%

Most occurring characters

ValueCountFrequency (%)
a 508708
13.5%
i 418276
11.1%
s 281997
 
7.5%
e 267917
 
7.1%
l 258041
 
6.8%
r 245993
 
6.5%
u 238357
 
6.3%
n 229721
 
6.1%
o 218434
 
5.8%
t 200564
 
5.3%
Other values (56) 904058
24.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3767394
99.9%
Dash Punctuation 2480
 
0.1%
Space Separator 1689
 
< 0.1%
Other Punctuation 353
 
< 0.1%
Uppercase Letter 96
 
< 0.1%
Open Punctuation 22
 
< 0.1%
Close Punctuation 22
 
< 0.1%
Math Symbol 8
 
< 0.1%
Modifier Symbol 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 508708
13.5%
i 418276
11.1%
s 281997
 
7.5%
e 267917
 
7.1%
l 258041
 
6.8%
r 245993
 
6.5%
u 238357
 
6.3%
n 229721
 
6.1%
o 218434
 
5.8%
t 200564
 
5.3%
Other values (28) 899386
23.9%
Uppercase Letter
ValueCountFrequency (%)
B 28
29.2%
A 11
 
11.5%
H 10
 
10.4%
M 9
 
9.4%
L 7
 
7.3%
P 7
 
7.3%
C 5
 
5.2%
V 5
 
5.2%
O 4
 
4.2%
G 2
 
2.1%
Other values (5) 8
 
8.3%
Other Punctuation
ValueCountFrequency (%)
' 192
54.4%
. 128
36.3%
! 19
 
5.4%
& 10
 
2.8%
? 4
 
1.1%
Open Punctuation
ValueCountFrequency (%)
( 12
54.5%
[ 10
45.5%
Close Punctuation
ValueCountFrequency (%)
) 12
54.5%
] 10
45.5%
Dash Punctuation
ValueCountFrequency (%)
- 2480
100.0%
Space Separator
ValueCountFrequency (%)
1689
100.0%
Math Symbol
ValueCountFrequency (%)
× 8
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3767490
99.9%
Common 4576
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 508708
13.5%
i 418276
11.1%
s 281997
 
7.5%
e 267917
 
7.1%
l 258041
 
6.8%
r 245993
 
6.5%
u 238357
 
6.3%
n 229721
 
6.1%
o 218434
 
5.8%
t 200564
 
5.3%
Other values (43) 899482
23.9%
Common
ValueCountFrequency (%)
- 2480
54.2%
1689
36.9%
' 192
 
4.2%
. 128
 
2.8%
! 19
 
0.4%
( 12
 
0.3%
) 12
 
0.3%
[ 10
 
0.2%
] 10
 
0.2%
& 10
 
0.2%
Other values (3) 14
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3771989
> 99.9%
None 77
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 508708
13.5%
i 418276
11.1%
s 281997
 
7.5%
e 267917
 
7.1%
l 258041
 
6.8%
r 245993
 
6.5%
u 238357
 
6.3%
n 229721
 
6.1%
o 218434
 
5.8%
t 200564
 
5.3%
Other values (43) 903981
24.0%
None
ValueCountFrequency (%)
ë 26
33.8%
é 11
14.3%
× 8
 
10.4%
ü 7
 
9.1%
ê 5
 
6.5%
ö 5
 
6.5%
û 4
 
5.2%
á 3
 
3.9%
ó 2
 
2.6%
ï 2
 
2.6%
Other values (3) 4
 
5.2%
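
The summary percentages reported for infraspecificEpithet can be cross-checked with simple arithmetic. The sketch below uses only figures copied from this report. The choice of denominator (non-missing cells for the distinct, unique, and mean-length figures) is an assumption inferred from the numbers, not something the report states.

```python
# Figures copied from the report overview and the infraspecificEpithet summary.
n_observations = 5_019_782
n_missing = 4_607_995
n_distinct = 25_248
n_unique = 9_012
total_characters = 3_772_066

n_present = n_observations - n_missing  # non-missing cells

missing_pct = 100 * n_missing / n_observations
# Assumption: distinct/unique percentages and mean length use the non-missing count.
distinct_pct = 100 * n_distinct / n_present
unique_pct = 100 * n_unique / n_present
mean_length = total_characters / n_present

print(f"Non-missing cells: {n_present}")
print(f"Missing (%):  {missing_pct:.1f}%")
print(f"Distinct (%): {distinct_pct:.1f}%")
print(f"Unique (%):   {unique_pct:.1f}%")
print(f"Mean length:  {mean_length:.9f}")
```

Under that assumption the results agree with the reported 91.8%, 6.1%, 2.2%, and mean length 9.160235753.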
Distinct 5
Distinct (%) < 0.1%
Missing 224
Missing (%) < 0.1%
Memory size 38.3 MiB

Length

Max length 7
Median length 7
Mean length 6.632937601
Min length 2

Characters and Unicode

Total characters 33294415
Distinct characters 14
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row species
2nd row species
3rd row species
4th row species
5th row species
ValueCountFrequency (%)
species 4187406
83.4%
genus 420365
 
8.4%
var 230817
 
4.6%
subsp 148885
 
3.0%
f 32085
 
0.6%

Most occurring characters

ValueCountFrequency (%)
s 9092947
27.3%
e 8795177
26.4%
p 4336291
13.0%
c 4187406
12.6%
i 4187406
12.6%
u 569250
 
1.7%
g 420365
 
1.3%
n 420365
 
1.3%
. 411787
 
1.2%
v 230817
 
0.7%
Other values (4) 642604
 
1.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 32882628
98.8%
Other Punctuation 411787
 
1.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 9092947
27.7%
e 8795177
26.7%
p 4336291
13.2%
c 4187406
12.7%
i 4187406
12.7%
u 569250
 
1.7%
g 420365
 
1.3%
n 420365
 
1.3%
v 230817
 
0.7%
a 230817
 
0.7%
Other values (3) 411787
 
1.3%
Other Punctuation
ValueCountFrequency (%)
. 411787
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 32882628
98.8%
Common 411787
 
1.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 9092947
27.7%
e 8795177
26.7%
p 4336291
13.2%
c 4187406
12.7%
i 4187406
12.7%
u 569250
 
1.7%
g 420365
 
1.3%
n 420365
 
1.3%
v 230817
 
0.7%
a 230817
 
0.7%
Other values (3) 411787
 
1.3%
Common
ValueCountFrequency (%)
. 411787
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 33294415
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 9092947
27.3%
e 8795177
26.4%
p 4336291
13.0%
c 4187406
12.6%
i 4187406
12.6%
u 569250
 
1.7%
g 420365
 
1.3%
n 420365
 
1.3%
. 411787
 
1.2%
v 230817
 
0.7%
Other values (4) 642604
 
1.9%
Distinct 65242
Distinct (%) 1.4%
Missing 355313
Missing (%) 7.1%
Memory size 38.3 MiB

Length

Max length 77
Median length 70
Mean length 9.034118996
Min length 1

Characters and Unicode

Total characters 42139368
Distinct characters 110
Distinct categories 9
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 15043
Unique (%) 0.3%

Sample

1st row L.
2nd row Heim
3rd row L.
4th row Warb.
5th row L.
ValueCountFrequency (%)
l 1269845
 
17.0%
360987
 
4.8%
ex 256772
 
3.4%
blume 179897
 
2.4%
dc 108295
 
1.4%
benth 85823
 
1.1%
miq 72408
 
1.0%
r.br 65429
 
0.9%
willd 61790
 
0.8%
merr 59114
 
0.8%
Other values (13133) 4949569
66.3%

Most occurring characters

ValueCountFrequency (%)
. 5991115
 
14.2%
2806041
 
6.7%
e 2684191
 
6.4%
r 1931749
 
4.6%
l 1914225
 
4.5%
L 1600647
 
3.8%
a 1554933
 
3.7%
) 1480364
 
3.5%
( 1480364
 
3.5%
n 1371375
 
3.3%
Other values (100) 19324364
45.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 21643314
51.4%
Uppercase Letter 8321416
 
19.7%
Other Punctuation 6375580
 
15.1%
Space Separator 2806041
 
6.7%
Close Punctuation 1480387
 
3.5%
Open Punctuation 1480387
 
3.5%
Dash Punctuation 32237
 
0.1%
Decimal Number 4
 
< 0.1%
Math Symbol 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2684191
12.4%
r 1931749
 
8.9%
l 1914225
 
8.8%
a 1554933
 
7.2%
n 1371375
 
6.3%
o 1354940
 
6.3%
i 1277795
 
5.9%
u 1109030
 
5.1%
t 1098013
 
5.1%
h 1075013
 
5.0%
Other values (45) 6272050
29.0%
Uppercase Letter
ValueCountFrequency (%)
L 1600647
19.2%
B 817292
 
9.8%
S 648729
 
7.8%
M 543932
 
6.5%
H 513761
 
6.2%
C 493195
 
5.9%
R 431939
 
5.2%
D 408905
 
4.9%
A 358247
 
4.3%
P 336475
 
4.0%
Other values (26) 2168294
26.1%
Other Punctuation
ValueCountFrequency (%)
. 5991115
94.0%
& 360876
 
5.7%
, 14711
 
0.2%
' 8762
 
0.1%
? 104
 
< 0.1%
! 7
 
< 0.1%
; 2
 
< 0.1%
/ 2
 
< 0.1%
: 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 2
50.0%
9 1
25.0%
7 1
25.0%
Close Punctuation
ValueCountFrequency (%)
) 1480364
> 99.9%
] 23
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 1480364
> 99.9%
[ 23
 
< 0.1%
Space Separator
ValueCountFrequency (%)
2806041
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 32237
100.0%
Math Symbol
ValueCountFrequency (%)
| 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 29964730
71.1%
Common 12174638
28.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2684191
 
9.0%
r 1931749
 
6.4%
l 1914225
 
6.4%
L 1600647
 
5.3%
a 1554933
 
5.2%
n 1371375
 
4.6%
o 1354940
 
4.5%
i 1277795
 
4.3%
u 1109030
 
3.7%
t 1098013
 
3.7%
Other values (81) 14067832
46.9%
Common
ValueCountFrequency (%)
. 5991115
49.2%
2806041
23.0%
) 1480364
 
12.2%
( 1480364
 
12.2%
& 360876
 
3.0%
- 32237
 
0.3%
, 14711
 
0.1%
' 8762
 
0.1%
? 104
 
< 0.1%
] 23
 
< 0.1%
Other values (9) 41
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 41970851
99.6%
None 168517
 
0.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 5991115
 
14.3%
2806041
 
6.7%
e 2684191
 
6.4%
r 1931749
 
4.6%
l 1914225
 
4.6%
L 1600647
 
3.8%
a 1554933
 
3.7%
) 1480364
 
3.5%
( 1480364
 
3.5%
n 1371375
 
3.3%
Other values (61) 19155847
45.6%
None
ValueCountFrequency (%)
ü 81436
48.3%
é 43052
25.5%
ö 13870
 
8.2%
ä 6673
 
4.0%
á 4502
 
2.7%
ó 4378
 
2.6%
è 3798
 
2.3%
ø 2862
 
1.7%
ê 1148
 
0.7%
ç 862
 
0.5%
Other values (29) 5936
 
3.5%

vernacularName
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 5019781
Missing (%) > 99.9%
Memory size 38.3 MiB

Length

Max length 7
Median length 7
Mean length 7
Min length 7

Characters and Unicode

Total characters 7
Distinct characters 6
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Plantae
ValueCountFrequency (%)
plantae 1
100.0%

Most occurring characters

ValueCountFrequency (%)
a 2
28.6%
P 1
14.3%
l 1
14.3%
n 1
14.3%
t 1
14.3%
e 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6
85.7%
Uppercase Letter 1
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2
33.3%
l 1
16.7%
n 1
16.7%
t 1
16.7%
e 1
16.7%
Uppercase Letter
ValueCountFrequency (%)
P 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2
28.6%
P 1
14.3%
l 1
14.3%
n 1
14.3%
t 1
14.3%
e 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2
28.6%
P 1
14.3%
l 1
14.3%
n 1
14.3%
t 1
14.3%
e 1
14.3%

nomenclaturalCode
Text

Constant 

Distinct 1
Distinct (%) < 0.1%
Missing 3
Missing (%) < 0.1%
Memory size 38.3 MiB

Length

Max length 3
Median length 3
Mean length 3
Min length 3

Characters and Unicode

Total characters 15059337
Distinct characters 3
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row ICN
2nd row ICN
3rd row ICN
4th row ICN
5th row ICN
ValueCountFrequency (%)
icn 5019779
100.0%

Most occurring characters

ValueCountFrequency (%)
I 5019779
33.3%
C 5019779
33.3%
N 5019779
33.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 15059337
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 5019779
33.3%
C 5019779
33.3%
N 5019779
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 15059337
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 5019779
33.3%
C 5019779
33.3%
N 5019779
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15059337
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 5019779
33.3%
C 5019779
33.3%
N 5019779
33.3%

nomenclaturalStatus
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 5019781
Missing (%) > 99.9%
Memory size 38.3 MiB

Length

Max length 6
Median length 6
Mean length 6
Min length 6

Characters and Unicode

Total characters 6
Distinct characters 6
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Poales
ValueCountFrequency (%)
poales 1
100.0%

Most occurring characters

ValueCountFrequency (%)
P 1
16.7%
o 1
16.7%
a 1
16.7%
l 1
16.7%
e 1
16.7%
s 1
16.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5
83.3%
Uppercase Letter 1
 
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 1
20.0%
a 1
20.0%
l 1
20.0%
e 1
20.0%
s 1
20.0%
Uppercase Letter
ValueCountFrequency (%)
P 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
P 1
16.7%
o 1
16.7%
a 1
16.7%
l 1
16.7%
e 1
16.7%
s 1
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
P 1
16.7%
o 1
16.7%
a 1
16.7%
l 1
16.7%
e 1
16.7%
s 1
16.7%